perm filename V2ANS5.TEX[TEX,DEK] blob
sn#522826 filedate 1980-07-16 generic text, type T, neo UTF8
%folio 821 galley 5 (C) Addison-Wesley 1978 *
\ansno 13. For Case 1, use a Type-1 chain followed by $2↑{A+C}
+ 2↑{B+C} + 2↑A + 2↑B$; or use the factor method. For Case 2,
use a Type-2 chain followed by $2↑{A+C+1} + 2↑{B+C} + 2↑A +
2↑B$. For Case 3, use a Type-5 chain followed by addition of
$2↑A + 2↑{A-1}$, or use the factor method. For Case 4, $n =
135 \cdot 2↑D$, so we may use the factor method.
\ansno 14. (a)\9 It is easy to verify that steps $r - 1$ and $r
- 2$ are not both small, so let us assume that step $r - 1$
is small and step $r - 2$ is not. If $c = 1$, then $λ(a↓{r-1})
= λ(a↓{r-k})$, so $k = 2$; and since $4 ≤ \nu (a↓r) = \nu (a↓{r-1})
+ \nu (a↓{r-k}) - 1 ≤ \nu (a↓{r-1}) + 1$, we have $\nu (a↓{r-1})
≥ 3$, making $r - 1$ a star step (lest $a↓0$, $a↓1$, $\ldotss$, $a↓{r-3}$,
$a↓{r-1}$ include only one small step). Then $a↓{r-1} = a↓{r-2}
+ a↓{r-q}$ for some $q$, and if we replace $a↓{r-2}$, $a↓{r-1}$,
$a↓r$ by $a↓{r-2}$, $2a↓{r-2}$, $2a↓{r-2} + a↓{r-q} = a↓r$, we obtain
another counterexample chain in which step $r$ is small; but
this is impossible. On the other hand, if $c ≥ 2$, then $4 ≤
\nu(a↓r) ≤ \nu (a↓{r-1}) + \nu (a↓{r-k}) - 2 ≤ \nu (a↓{r-1})$;
hence $\nu (a↓{r-1}) = 4$, $\nu (a↓{r-k}) = 2$, and $c = 2$. This
leads readily to an impossible situation by a consideration
of the six types in the proof of Theorem↔B.
(b)\9 If $λ(a↓{r-k}) < m - 1$, we have $c ≥ 3$, so
$\nu (a↓{r-k}) + \nu (a↓{r-1}) ≥ 7$ by (22); therefore both
$\nu (a↓{r-k})$ and $\nu (a↓{r-1})$ are $≥3$. All small steps
must be $≤r - k$, and $λ(a↓{r-k}) = m - k + 1$. If $k ≥ 4$,
we must have $c = 4$, $k = 4$, $\nu (a↓{r-1}) = \nu (a↓{r-4}) =
4$; thus $a↓{r-1} ≥ 2↑m + 2↑{m-1} + 2↑{m-2}$, and $a↓{r-1}$
must equal $2↑m + 2↑{m-1} + 2↑{m-2} + 2↑{m-3}$; but $a↓{r-4}
≥ {1\over 8}a↓{r-1}$ now implies that $a↓{r-1} = 8a↓{r-4}$.
Thus $k = 3$ and $a↓{r-1} > 2↑m + 2↑{m-1}$. Since $a↓{r-2} <
2↑m$ and $a↓{r-3} < 2↑{m-1}$, step $r - 1$ must be a doubling;
but step $r - 2$ is a nondoubling, since $a↓{r-1} ≠ 4a↓{r-3}$.
Furthermore, since $\nu (a↓{r-3}) ≥ 3$, $r - 3$ is a star step;
and $a↓{r-2} = a↓{r-3} + a↓{r-5}$ would imply that $a↓{r-5}
= 2↑{m-2}$, hence we must have $a↓{r-2} = a↓{r-3} + a↓{r-4}$.
As in a similar case treated in the text, the only possibility
is now seen to be $a↓{r-4} = 2↑{m-2} + 2↑{m-3}$, $a↓{r-3} = 2↑{m-2}
+ 2↑{m-3} + 2↑{d+1} + 2↑d$, $a↓{r-1} = 2↑m + 2↑{m-1} + 2↑{d+2}
+ 2↑{d+1}$, and even this possibility is impossible.
\ansno 16. $l↑B(n) = λ(n) + \nu (n) - 1$; so if $n = 2↑k$,
$l↑B(n)/λ(n) = 1$, but if $n = 2↑{k+1} - 1$, $l↑B(n)/λ(n) =
2$.
\ansno 17. Let $i↓1 < \cdots < i↓t$. Delete any intervals $I↓k$
that can be removed without affecting the union $I↓1 ∪ \cdots
∪ I↓t$.\xskip (The interval $(j↓k, i↓k]$ may be dropped out if either
$j↓{k+1} ≤ j↓k$ or $j↓1 < j↓2 < \cdots$ and $j↓{k+1}
≤ i↓{k-1}$.)\xskip Now combine overlapping intervals $(j↓1, i↓1]$,
$\ldotss$, $(j↓d, i↓d]$ into an interval $(j↑\prime , i↑\prime
] = (j↓1, i↓d]$ and note that
$$a↓{i↑\prime}<a↓{j↑\prime}(1 + \delta )↑{i↓1-j↓1+\cdots+i↓d-j↓d}
≤ a↓{j↑\prime}(1 + \delta )↑{2(i↑\prime-j↑\prime)},$$
since each point of $(j↑\prime , i↑\prime ]$ is
covered at most twice in $(j↓1, i↓1] ∪ \cdots ∪ (j↓d, i↓d]$.
\ansno 18. Call $f(m)$ a ``nice'' function if
$\biglp\log f(m)\bigrp /m → 0$ as $m→∞$. A
polynomial in↔$m$ is nice. The product of nice functions is
nice. If $g(m) → 0$ and $c$ is a positive constant, then $c↑{mg(m)}$
is nice; also ${2m\choose mg(m)}$ is nice, for by Stirling's
approximation this is equivalent to saying that $g(m)\log\biglp
1/g(m)\bigrp → 0$.
Now replace each term of the summation
by the maximum term that is attained for any $s$, $t$, $v$. The
total number of terms is nice, and so are ${m+s\choose t+v}$,
${t+v\choose v} ≤ 2↑{t+v}$, and $β↑{2v}$, because $(t +
v)/m → 0$. Finally, ${(m+s)↑2\choose t} ≤ (2m)↑{2t}\!/t!
< (4m↑2\!/t)↑te↑t$, where $(4e)↑t$ is nice; setting $t$ to its
maximum value $(1 - {1\over 2}ε)m/λ(m)$, we have the upper bound $(m↑2\!/t)↑t
= \biglp mλ(m)/(1 - {1\over 2}ε)\bigrp ↑t = 2↑{m(1-ε/2)}
\cdot f(m)$, where $f(m)$ is nice. Hence the entire sum is less
than $α↑m$ for large $m$, if $α = 2↑{1-\eta}$, $0 < \eta < {1\over
2}ε$.
\ansno 19. (a)\9 $M ∩ N$, $M ∪ N$, $M \uplus N$, respectively;
see Eqs.\ 4.5.2--6, 4.5.2--7.
(b)\9 $f(z)g(z)$,\xskip$\lcm\biglp f(z), g(z)\bigrp $,\xskip
$\gcd\biglp f(z), g(z)\bigrp $.\xskip (For the same reasons as (a),
because the monic irreducible polynomials over the complex numbers
are precisely the polynomials $z - \zeta$.)
(c)\9 \α{Commutative law}s $A \uplus B = B \uplus A$, $A ∪ B =
B ∪ A$, $A ∩ B = B ∩ A$. \α{Associative law}s $A \uplus (B \uplus C) = (A
\uplus B) \uplus C$, $A ∪ (B ∪ C) = (A ∪ B) ∪ C$, $A ∩ (B ∩ C) = (A ∩ B)
∩ C$. \α{Distributive laws} $A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)$, $A
∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)$, $A \uplus (B ∪ C) = (A \uplus B) ∪ (A
\uplus C)$, $A \uplus (B ∩ C) = (A \uplus B) ∩ (A \uplus C)$. \α{Idempotent} laws
$A ∪ A = A$, $A∩A = A$. \α{Absorption laws} $A ∪ (A ∩ B) = A$, $A ∩ (A
∪ B) = A$, $A ∩ (A \uplus B) = A$, $A ∪ (A \uplus B) = A \uplus B$. Identity
and zero laws $\emptyset \uplus A = A$, $\emptyset ∪A=A$, $\emptyset∩A=\emptyset$,
where $\emptyset$ is the empty
multiset. \α{Counting law} $A \uplus B = (A ∪ B) \uplus (A ∩ B)$. Further
properties analogous to those of sets come from the \α{partial
ordering} defined by the rule $A \subset B$ iff $A ∩ B = A$ (iff
$A ∪ B = B$).
{\sl Notes:} Other common applications of multisets
are zeros and poles of meromorphic functions, invariants of
matrices in canonical form, invariants of finite Abelian groups,
etc.; multisets can be useful in combinatorial counting arguments
and in the development of measure theory. The terminal strings
of a noncircular \α{context-free grammar} form a multiset that
is a set if and only if the grammar is unambiguous. Although
\α{multiset}s appear frequently in mathematics, they often must
be treated rather clumsily because there is currently no standard
way to treat sets with repeated elements. Several mathematicians
have voiced their belief that the lack of adequate terminology
and notation for this common concept has been a definite handicap
to the development of mathematics.\xskip(A multiset is, of course,
formally equivalent to a mapping from a set into the nonnegative
integers, but this formal equivalence is of little or no practical
value for creative mathematical reasoning.)\xskip The author has discussed
this matter with many people in an attempt to find a good remedy.
Some of the names suggested for the concept were list, bunch, \α{bag},
heap, sample, weighted set, collection; but these words either
conflict with present terminology, have an improper connotation,
or are too much of a mouthful to say and to write conveniently.
It does not seem out of place to coin a new word for such an
important concept, and ``multiset'' has been suggested by N.
G. \α{de Bruijn}. The notation ``$A \uplus B$'' has been selected by
the author to avoid conflict with existing notations and to
stress the analogy with set union. It would not be as desirable
to use ``$A + B$'' for this purpose, since algebraists have
found that $A + B$ is a good notation for $\leftset α + β\relv α \in A\hbox{ and }
β \in B\rightset$. If $A$ is a multiset of nonnegative integers,
let $G(z) = \sum ↓{n\in A} z↑n$ be a \α{generating function} corresponding
to↔$A$.\xskip (Generating functions with nonnegative integer coefficients
obviously correspond one-to-one with multisets of nonnegative
integers.) If $G(z)$ corresponds to $A$ and $H(z)$ to $B$, then
$G(z) + H(z)$ corresponds to $A \uplus B$ and $G(z)H(z)$ corresponds
to $A + B$. If we form ``\α{Dirichlet}'' generating functions $g(z)
= \sum ↓{n\in A} 1/n↑z$, $h(z) = \sum ↓{n\in B} 1/n↑z$, then the product
$g(z)h(z)$ corresponds to the multiset product $AB$.
\ansno 20. Type 3: $(S↓0, \ldotss , S↓r) = (M↓{00}, \ldotss ,
M↓{r0}) = (\{0\}$, $\ldotss$, $\{A\}$, $\{A - 1, A\}$, $\{A - 1, A,
A\}$, $\{A - 1, A - 1, A, A, A\}$, $\ldotss$, $\{A + C - 3, A + C-3,A+C
- 2, A + C - 2, A + C - 2\})$.\xskip Type 5: $(M↓{00}, \ldotss , M↓{r0})
= (\{0\}$, $\ldotss$, $\{A\}$, $\{A - 1, A\}$, $\ldotss$, $\{A + C - 1,
A + C\}$, $\{A + C - 1, A + C - 1, A + C\}$, $\ldotss$, $\{A + C +
D - 1, A + C + D - 1, A + C + D\})$; $(M↓{01}, \ldotss , M↓{r1})
=(\emptyset$, $\ldotss$,$\emptyset$, $\emptyset$,
$\ldotss$, $\emptyset$, $\{A+C-2\}$, $\ldotss$,
$\{A + C + D - 2\})$, $S↓i = M↓{i0} \uplus M↓{i1}$.
\ansno 21. For example, let $u = 2↑{8q+5}$, $x = (2↑{(q+1)u} -
1)/(2↑u - 1) = 2↑{qu} + \cdots + 2↑u + 1$, $y = 2↑{(q+1)u} + 1$.
Then $xy = (2↑{2(q+1)u} - 1)/(2↑u - 1)$. If $n = 2↑{4(q+1)u}
+ xy$, we have $l(n) ≤ 4(q + 1)u + q + 2$ by Theorem↔F\null, but
$l↑\ast(n) = 4(q + 1)u + 2q + 2$ by Theorem↔H.
\ansno 22. Underline everything except the $u - 1$ insertions
used in the calculation of $x$.
\ansno 23. Theorem G (everything underlined).
\ansno 24. Use the numbers $(B↑{a↓i} - 1)/(B - 1)$, $0 ≤
i ≤ r$, underlined when $a↓i$ is underlined; and $c↓kB↑{i-1}(B↑{b↓j}
- 1)/(B - 1)$ for $0 ≤ j < t$, $0 < i ≤ b↓{j+1}-b↓j$, $1≤k ≤ l↑0(B)$, underlined
when $c↓k$ is underlined, where $c↓0$, $c↓1$, $\ldots$ is a minimum
length $l↑0$-chain for↔$B$. To prove the second inequality,
let $B = 2↑m$ and use (3).\xskip (The second inequality is rarely,
if ever, an improvement on Theorem↔G.)
%folio 824 galley 6a (C) Addison-Wesley 1978 *
\ansno 25. We may assume that $d↓k = 1$. Use the rule R $A↓{k-1}
\ldotsm A↓1$, where $A↓j = \null$``XR'' if ${d↓j = 1}$, $A↓j = \null$``R''
otherwise, and where ``R'' means take the square root, ``X''
means multiply by $x$. For example, if $y = (.1101101)↓2$, the
rule is R R XR XR R XR XR.\xskip (There exist binary square-root extraction
algorithms suitable for computer \α{hardware}, requiring an execution
time comparable to that of division; computers with such hardware
could therefore calculate more general fractional powers using the
technique in this exercise.)
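The rule can be exercised numerically; here is a Python sketch (added for illustration, with `bits` holding $d↓1$, $\ldotss$, $d↓k$ as an assumed calling convention).

```python
import math

def frac_power(x, bits):
    """Compute x**y for y = (.d1 d2 ... dk)_2 with dk = 1, using only
    square roots ("R") and multiplications by x ("X"), per the rule
    R A_{k-1} ... A_1 where A_j = "XR" if d_j = 1, else "R"."""
    assert bits[-1] == 1, "the answer assumes d_k = 1"
    result = x                       # start from x = x**d_k
    result = math.sqrt(result)       # the leading R
    for d in reversed(bits[:-1]):    # A_{k-1}, then A_{k-2}, ..., A_1
        if d == 1:
            result *= x              # X: multiply by x
        result = math.sqrt(result)   # R: take square root
    return result
```

For $y = (.1101101)↓2 = 109/128$ the operations performed are exactly R R XR XR R XR XR.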
\ansno 26. If we know the pair $(F↓k, F↓{k-1})$, then we have $(F↓{k+1},
F↓k) = (F↓k + F↓{k-1}, F↓k)$ and $(F↓{2k}, F↓{2k-1}) = (F↑{2}↓{k}
+ 2F↓kF↓{k-1}, F↑{2}↓{k} + F↑{2}↓{k-1})$; so a binary method
can be used to calculate $(F↓n, F↓{n-1})$, using $O(\log n)$
arithmetic operations. Perhaps better is to use the pair of
values $(F↓k, L↓k)$, where $L↓k = F↓{k-1} + F↓{k+1}$ (cf.\ Section
4.5.4); then we have $(F↓{k+1}, L↓{k+1}) = \biglp {1\over 2}(F↓k + L↓k),
{1\over 2}(5F↓k + L↓k)\bigrp$, $(F↓{2k}, L↓{2k}) = \biglp F↓kL↓k, L↑{2}↓{k}
- 2(-1)↑k\bigrp $.
For the general \α{linear recurrence} $x↓n = a↓1x↓{n-1}
+ \cdots + a↓dx↓{n-d}$, we can compute $x↓n$ in $O(d↑3\log
n)$ arithmetic operations by computing the $n$th power of an
appropriate $d\times d$ matrix.\xskip [This observation is due to J. C.
P. \α{Miller} and D. J. S. \α{Brown}, {\sl Comp.\ J.} {\bf 9} (1966),
188--190.]
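A Python sketch of the binary method based on the first pair of identities (an added illustration; the $(F↓k, L↓k)$ variant is analogous):

```python
def fib_pair(n):
    """Return (F_n, F_{n-1}) in O(log n) arithmetic operations, using
    (F_{k+1}, F_k) = (F_k + F_{k-1}, F_k) and
    (F_{2k}, F_{2k-1}) = (F_k^2 + 2 F_k F_{k-1}, F_k^2 + F_{k-1}^2)."""
    if n == 1:
        return (1, 0)                    # (F_1, F_0)
    f, g = fib_pair(n // 2)              # (F_k, F_{k-1}) with k = n//2
    f, g = f*f + 2*f*g, f*f + g*g        # double: (F_{2k}, F_{2k-1})
    if n % 2:
        f, g = f + g, f                  # advance: (F_{2k+1}, F_{2k})
    return (f, g)
```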
\ansno 27. First form the $2↑m - m - 1$ products $x↑{e↓1}↓{1}
\ldotss x↑{e↓m}↓{m}$, for all sequences of exponents such that
$0 ≤ e↓j ≤ 1$ and $e↓1 + \cdots +
e↓m ≥ 2$. Let $n↓j = (d↓{jλ} \ldotsm d↓{j1}d↓{j0})↓2$; to complete the
calculation, take $x↑{d↓{1λ}}↓{1} \ldotsm x↑{d↓{mλ}}↓{m}$,
then square and multiply by $x↑{d↓{1i}}↓{1} \ldotss x↑{d↓{mi}}↓{m}$,
for $i = λ - 1$, $\ldotss$, 1, 0.\xskip[Straus showed in
{\sl AMM \bf 71} (1964), 807--808, that $2λ(n)$ may be replaced by
$(1 + ε)λ(n)$ for any $ε > 0$, by generalizing this binary method
to $2↑k$-ary as in Theorem↔D\null. See exercise↔39 for later developments.]
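The two-phase scheme can be sketched as follows in Python (an added illustration; for simplicity the table stores all $2↑m$ subset products, including the trivial ones that the answer excludes from its count):

```python
def multi_power(xs, ns):
    """Compute x_1^{n_1} ... x_m^{n_m}: precompute all products with
    0/1 exponents, then square and multiply in the subset product
    selected by bit i of each exponent, for i = lambda-1, ..., 1, 0."""
    m = len(xs)
    table = [1] * (1 << m)
    for e in range(1, 1 << m):
        low = e & -e                      # lowest set bit of pattern e
        table[e] = table[e ^ low] * xs[low.bit_length() - 1]
    lam = max(n.bit_length() for n in ns) - 1
    y = table[sum(((n >> lam) & 1) << j for j, n in enumerate(ns))]
    for i in range(lam - 1, -1, -1):
        y = y * y                         # square
        y *= table[sum(((n >> i) & 1) << j for j, n in enumerate(ns))]
    return y
```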
\def\\{\mathbin{\char'562}}
\ansno 28. (a)\9 $x \\ y = x ∨ y ∨ (x + y)$, where
``$∨$'' is \α{logical ``or''}, cf.\ exercise 4.6.2--26; clearly $\nu
(x \\ y) ≤ \nu (x ∨ y) + \nu (x ∧ y) = \nu (x) + \nu (y)$.\xskip (b)
Note first that $A↓{i-1}/2↑{d↓{i-1}}\subset
A↓i/2↑{d↓i}$ for $1 ≤ i ≤ r$. Secondly, note that $d↓j = d↓{i-1}$ in
a nondoubling; for otherwise $a↓{i-1} ≥ 2a↓j ≥ a↓j + a↓k = a↓i$.
Hence $A↓j \subset A↓{i-1}$ and $A↓k \subset A↓{i-1}/2↑{d↓j-d↓k}
$.\xskip (c) An easy induction on $i$, except that
close steps need closer attention. Let us say that $m$ has property
$P(α)$ if the 1's in its binary representation all appear in
consecutive blocks of $≥α$ in a row. If $m$ and $m↑\prime$ have
$P(α)$, so does $m \\ m↑\prime $; if $m$ has $P(α)$ then $\rho
(m)$ has $P(α + \delta )$. Hence $B↓i$ has $P(1 + \delta c↓i)$.
Finally if $m$ has $P(α)$ then $\nu \biglp \rho (m)\bigrp ≤ (α +
\delta )\nu (m)/α$; for $\nu (m) = \nu ↓1 + \cdots + \nu ↓q$,
where each block size $\nu ↓j$ is $≥α$, hence $\nu \biglp \rho
(m)\bigrp ≤ (\nu ↓1 + \delta ) + \cdots + (\nu ↓q + \delta) ≤ (1
+ \delta /α)\nu ↓1 + \cdots + (1 + \delta /α)\nu ↓q$.\xskip (d) Let
$f = b↓r + c↓r$ be the number of nondoublings and $s$ the number
of small steps. If $f ≥ 3.271 \lg \nu (n)$ we have $s ≥ \lg
\nu (n)$ as desired, by (16). Otherwise we have $a↓i ≤ (1 +
2↑{-\delta})↑{b↓i}2↑{c↓i+d↓i}$ for $0 ≤ i ≤
r$, hence $n ≤ \biglp (1 + 2↑{-\delta})/2\bigrp ↑{b↓r}2↑r$,
and $r ≥ \lg n + b↓r - b↓r \lg(1 + 2↑{-\delta}) ≥ \lg n + \lg
\nu (n) - \lg(1 + \delta c↓r) - b↓r \lg(1 + 2↑{-\delta})$. Let
$\delta = \lceil \lg(f + 1)\rceil $; then $\ln(1 + 2↑{-\delta})
≤ \ln\biglp 1 + 1/(f + 1)\bigrp ≤ 1/(f + 1) ≤ \delta /(1 +
\delta f)$, and it follows that $\lg(1 + \delta x) + (f - x)\lg(1
+ 2↑{-\delta}) ≤ \lg(1 + \delta f)$ for $0 ≤ x ≤ f$. Hence finally
$l(n) ≥ \lg n + \lg \nu(n) - \lg\biglp 1 + (3.271\lg \nu
(n)) \lceil \lg(1 + 3.271 \lg \nu (n))\rceil\bigrp$.\xskip
[{\sl Theoretical Comp.\ Sci.\ \bf 1} (1975),
1--12.]
\ansno 29. In the paper just cited, \β{Sch\"onhage}
refined the method of exercise↔28 to prove that $l(n)
≥ \lg n + \lg \nu (n) - 2.13$ for all $n$. Can the remaining
gap be closed?
\ansno 30. $n = 31$ is the smallest example;
$l(31) = 7$, but 1, 2, 4, 8, 16, 32, 31 is an addition-subtraction
chain of length 6.\xskip
[After proving Theorem E\null, \α{Erd\H os} stated that the
same result holds
also for addition-subtraction chains. Sch\"onhage has
extended the lower bound of exercise↔28 to addition-subtraction
chains, with $\nu (n)$ replaced by $\=\nu (n) =\null$minimum
number of nonzero digits to represent $n = (n↓q \ldotsm n↓0)↓2$
where each $n↓j$ is $-1$, 0, or $+1$. This quantity $\=\nu(n)$ is the number
\inx{negative digits}
of 1's, in the ordinary binary representation of $n$, that
are immediately preceded by 0 or by the string $00(10)↑k1$ for
some $k ≥ 0$.]
\ansno 32. First compute $2↑i$ for $1≤i≤λ(n↓m)$, then compute each $n=n↓j$
by the following variant of the $2↑k$-ary method: For all odd $i<2↑k$, compute
$f↓i=\sum\leftset 2↑{kt+e}\relv d↓t=2↑ei\rightset$ where $n=(\ldotsm d↓1d↓0)↓{2↑k}$,
in at most $\lfloor{1\over k}\lg n\rfloor$ steps; then compute $n=\sum if↓i$ in
at most $\sum l(i)+2↑{k-1}$ further steps. The number of steps per $n↓j$ is
$≤\lfloor{1\over k}\lg n\rfloor+O(k2↑k)$, and this is $λ(n)/λλ(n)+O\biglp
λ(n)λλλ(n)/λλ(n)↑2\bigrp$ when $k=\lfloor\lg\lg n-3\lg\lg\lg n\rfloor$.
[A generalization of Theorem↔E gives the corresponding lower bound.\xskip
Reference: {\sl SIAM J. Computing \bf5} (1976), 100--103. See also exercise↔39.]
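The decomposition $n = \sum if↓i$ can be sketched in Python (an added illustration of the bookkeeping only; building the actual chains from the $f↓i$ proceeds as described above):

```python
def decompose(n, k):
    """Write n = sum(i * f[i]) over odd i < 2^k, where f[i] collects
    2^(k*t + e) for every base-2^k digit d_t = 2^e * i with i odd."""
    f = {}
    t = 0
    while n >> (k * t):
        d = (n >> (k * t)) & ((1 << k) - 1)   # digit d_t
        if d:
            e = (d & -d).bit_length() - 1     # d = 2^e * i, i odd
            f[d >> e] = f.get(d >> e, 0) + (1 << (k * t + e))
        t += 1
    return f
```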
\ansno 33. The following construction due to D. J. \α{Newman} provides the best
upper bound currently known: Let $k=p↓1\ldotsm p↓r$ be the product of the first
$r$ primes. Compute $k$ and all quadratic residues mod $k$ by the method of
exercise↔32, in $O(2↑{-r}k\log k)$ steps (because there are approximately $2↑{-r}k$
\α{quadratic residues}). Also compute all multiples of $k$ that are $≤m↑2$, in about
$m↑2\!/k$ further steps. Now $m$ additions suffice to compute $1↑2$, $2↑2$,
$\ldotss$, $m↑2$. We have
$k=\exp\biglp p↓r+O\biglp p↓r/(\log p↓r)↑{1000}\bigrp\bigrp$ where $p↓r$ is
given by the formula in exercise 4.5.4--35; so by choosing
$$\textstyle r=\hbox{\:u\char'142}\biglp1+(1+
{1\over2}\ln2)/\!\lg\lg m\bigrp\ln m/\!\ln\ln m\hbox{\:u\char'143}$$ it follows
that $l(1↑2,\ldotss,m↑2)=m+O\biglp m\cdot\exp(-{1\over2}\ln2\ln m/\!\ln\ln m)
\bigrp$.
On the other hand, D. \α{Dobkin} and R. \α{Lipton} have shown that, for any $ε>0$,
$l(1↑2,\ldotss,m↑2)>m+m↑{2/3-ε}$ when $m$ is sufficiently large [{\sl SIAM J.
Computing \bf9} (1980), 121--125].
\ansno 35. See {\sl Discrete Math.\ \bf23} (1978), 115--119.
\ansno 36. Eight; there are four ways to compute $39=12+12+12+3$ and two ways to
compute $79=39+39+1$.
\ansno 37. The statement is true. The labels in the reduced graph of the binary
chain are $\lfloor n/2↑k\rfloor$ for $k=e↓0$, $\ldotss$,↔0; they are 1, 2, $\ldotss$,
$2↑{e↓0}$, $n$ in the dual graph.\xskip[Similarly, the right-to-left $m$-ary method
of exercise↔9 is the dual of the left-to-right method.]
\ansno 38. $2↑t$ are equivalent to the binary chain; it would be $2↑{t-1}$ if
$e↓0=e↓1+1$. The number of chains equivalent to the scheme of Algorithm↔A is the
number of ways to compute the sum of $t+2$ numbers of which two are identical.
This is ${1\over2}f↓{t+1}+{1\over2}f↓t$, where $f↓m$ is the number of ways to
compute the sum of $m+1$ distinct numbers. When we take \α{commutativity} into
account, we see that $f↓m$ is $2↑{-m}$ times $(m+1)!$ times the number of
\α{binary trees} on $m$ nodes,
\inx{Catalan numbers}
so $f↓m=(2m-1)(2m-3)\ldotsm1$.
\ansno 39. The quantity $l\biglp[n↓1,n↓2,\ldotss,n↓m]\bigrp$ is the minimum of
$\hbox{arcs}-\hbox{vertices}+m$ taken over all directed graphs having $m$
vertices $s↓j$ whose in-degree is zero and one vertex↔$t$ whose out-degree is
zero, where there are exactly $n↓j$ oriented paths from $s↓j$ to $t$ for ${1≤j≤m}$.
The quantity $l(n↓1,n↓2,\ldotss,n↓m)$ is the minimum of $\hbox{arcs}-\hbox{vertices}
+1$ taken over all directed graphs having one vertex $s$ whose in-degree is zero
and $m$ vertices $t↓j$ whose out-degree is zero, where there are exactly $n↓j$
oriented paths from $s$ to $t↓j$ for ${1≤j≤m}$. These problems are \α{dual} to each
other, if we change the direction of all the arcs.
{\sl Note:} C. H. \α{Papadimitriou} points out that this is a special case of a
much more general theorem. Let $N=(n↓{ij})$ be an $m\times p$ matrix of nonnegative
integers having no row or column entirely zero. We can define $l(N)$ to be the
minimum number of multiplications needed to compute the set of \α{monomial}s
$\leftset x↓1↑{n↓{1j}}\ldotss x↓m↑{n↓{mj}}\relv 1≤j≤p\rightset$. Now $l(N)$ is
also the minimum of $\hbox{arcs}-\hbox{vertices}+m$ taken over all directed
graphs having $m$ vertices $s↓i$ whose in-degree is zero and $p$ vertices $t↓j$
whose out-degree is zero, where there are exactly $n↓{ij}$ oriented paths from
$s↓i$ to $t↓j$ for each $i$ and $j$. By duality we have $l(N)=l(N↑T)+m-p$.
N. \α{Pippenger} has proved a comprehensive generalization of the results of
exercises 27 and↔32. Let $L(m,p,n)$ be the maximum of $l(N)$ taken over all
$m\times p$↔matrices↔$N$ of nonnegative integers $n↓{ij}≤n$. Then $L(m,p,n)=
\min(m,p)\lg n+H/\!\lg H+{O\biglp m+p+H(\log\log H)↑{1/2}(\log H)↑{-3/2}\bigrp}$,
where $H=mp\lg(n+1)$.\xskip[{\sl SIAM J. Com\-puting \bf9} (1980), 230--250.]
\ansno 40. By exercise 39, it suffices to show that $l(m↓1n↓1+\cdots+m↓tn↓t)≤
l(m↓1,\ldotss,m↓t)+l\biglp[n↓1,\ldotss,n↓t]\bigrp$. But this is clear, since
we can first form $\{x↑{m↓1},\ldotss,x↑{m↓t}\}$ and then compute the monomial
$(x↑{m↓1})↑{n↓1}\ldotss(x↑{m↓t})↑{n↓t}$.
{\sl Notes:} Theorem F is a special case of this result, since we clearly
have $l(2↑m,x)
≤m+\nu(x)-1$ when $λ(x)≤m$. One strong way to state Olivos's theorem is that if
$a↓0$, $\ldotss$, $a↓r$ and $b↓0$, $\ldotss$, $b↓s$ are any addition chains,
then $l\biglp\sum c↓{ij}a↓ib↓j\bigrp≤{r+s+\sum c↓{ij}-1}$ for any $(r+1)\times(s+1)$
matrix of nonnegative integers↔$c↓{ij}$.
\ansno 41. [To appear.]\xskip
The stated formula can be proved whenever $A≥9m↑2$. Since this is a
polynomial in $m$, and since the problem of finding a minimum vertex cover is
NP↔hard (cf.\ Section↔7.9),
the problem of computing $l(n↓1,\ldotss,n↓m)$ is \α{NP complete}.\xskip
[It is unknown whether or not the problem of computing
$l(n)$ is NP↔complete.]
%folio 826 galley 6b (C) Addison-Wesley 1978 *
\ansbegin{4.6.4}
\ansno 1. Set $y ← x↑2$, then compute
$\biglp (\ldotsm(u↓{2n+1}y + u↓{2n-1})y + \cdotss)y+u↓1\bigrp x$.
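In Python (an added sketch; the list holds the odd-index coefficients $u↓1$, $u↓3$, $\ldotss$, $u↓{2n+1}$ in that order, an assumed calling convention):

```python
def eval_odd_poly(odd_coeffs, x):
    """Evaluate u_1 x + u_3 x^3 + ... + u_{2n+1} x^(2n+1): one
    multiplication for y = x^2, Horner's rule in y, and one final
    multiplication by x."""
    y = x * x
    acc = 0
    for c in reversed(odd_coeffs):   # (...(u_{2n+1} y + u_{2n-1}) y + ...)
        acc = acc * y + c
    return acc * x
```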
\ansno 2. Replacing $x$ in (2) by the polynomial $x + x↓0$ leads
to the following procedure:
\yyskip\hangindent 40pt\noindent\hbox to 40pt{\hfill\bf G1. }
Do step G2 for $k = n$, $n - 1$, $\ldotss$,
0 (in this order), and stop.
\yskip\hangindent 40pt\noindent\hbox to 40pt{\hfill\bf G2. }
Set $v↓k ← u↓k$, and then set
$v↓j ← v↓j + x↓0v↓{j+1}$ for $j = k$, $k + 1$, $\ldotss$, $n - 1$.\xskip
(When $k = n$, this step simply sets $v↓n ← u↓n$.)\quad\blackslug
\yyskip\noindent The computations turn out to be
identical to those in H1 and H2, but performed in a different order.
$\biglp$This
application was, in fact, \α{Newton}'s original motivation for using
scheme↔(2).$\bigrp$
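Steps G1 and G2 can be sketched as follows (an added Python illustration; `u[j]` holds $u↓j$, and the result gives the coefficients of $u(x + x↓0)$):

```python
def shift_poly(u, x0):
    """G1/G2: given u[0..n] with u(x) = u_n x^n + ... + u_0, return
    v[0..n] with u(x + x0) = v_n x^n + ... + v_0."""
    n = len(u) - 1
    v = [0] * (n + 1)
    for k in range(n, -1, -1):       # G1: k = n, n-1, ..., 0
        v[k] = u[k]                  # G2: set v_k <- u_k, then
        for j in range(k, n):        #     v_j <- v_j + x0 * v_{j+1}
            v[j] += x0 * v[j + 1]
    return v
```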
\ansno 3. The coefficient of $x↑k$ is a polynomial in $y$ that
may be evaluated by Horner's rule: $\biglp \ldotsm(u↓{n,0}x
+ (u↓{n-1,1}y + u↓{n-1,0}))x + \cdotss\bigrp x + \biglp (\ldotsm
(u↓{0,n}y + u↓{0,n-1})y + \cdotss)y + u↓{0,0}\bigrp $.\xskip [For
\inx{homogeneous polynomial}
a ``homogeneous'' polynomial, such as $u↓nx↑n + u↓{n-1}x↑{n-1}y
+ \cdots + u↓1xy↑{n-1} + u↓0y↑n$, another scheme is more efficient:
if $0<|x|≤|y|$, first divide $x$ by $y$, evaluate a polynomial in $x/y$, then
multiply by $y↑n$.]
\ansno 4. Rule (2) involves $4n$ or $3n$ real multiplications
and $4n$ or $7n$ real additions; (3) is worse, since it takes $4n
+ 2$ or $4n + 1$ mults, $4n + 2$ or $4n + 5$ adds.
\ansno 5. One multiplication to compute $x↑2$; $\lfloor
n/2\rfloor$ multiplications and $\lfloor n/2\rfloor$ additions
to evaluate the first line; $\lceil n/2\rceil$ multiplications
and $\lceil n/2\rceil - 1$ additions to evaluate the second
line; and one addition to add the two lines together. Total:
$n + 1$ multiplications and $n$ additions.
\ansno 6. \hbox to 21pt{\hfill\bf J1. }Compute and store the values $x↑{2}↓{0}$,
$\ldotss$, $x↑{\lceil n/2\rceil}↓{0}$.
\yskip\noindent\hbox to 40pt{\hfill\bf J2. }Set $v↓j ← u↓jx↑{j-\lfloor n/2\rfloor
}↓{0}$ for $0 ≤ j ≤ n$.
\yskip\noindent\hbox to 40pt{\hfill\bf J3. }For $k = 0$, 1, $\ldotss$, $n - 1$,
set $v↓j ← v↓j + v↓{j+1}$ for $j = n - 1$, $\ldotss$, $k + 1$, $k$.
\yskip\noindent\hbox to 40pt{\hfill\bf J4. }Set $v↓j ← v↓jx↑{\lfloor n/2\rfloor
-j}↓{0}$ for $0 ≤ j ≤ n$.\quad\blackslug
\yyskip\noindent There are $(n↑2 + n)/2$ additions, $n + \lceil
n/2\rceil - 1$ multiplications, $n$ divisions. Another multiplication
and division can be saved by treating $v↓n$ and $v↓0$ as special
cases.\xskip {\sl Reference: SIGACT News \bf 7}, 3 (Summer 1975),
32--34.
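Steps J1–J4 in Python (an added sketch; $x↓0$ must be nonzero, and exact rational arithmetic avoids rounding in the scaling steps):

```python
def shift_poly_cheap(u, x0):
    """Compute the coefficients of u(x + x0) with O(n) multiplications
    and divisions plus (n^2 + n)/2 additions: scale by powers of x0,
    shift by 1 via repeated suffix sums, then unscale."""
    n = len(u) - 1
    h = n // 2
    v = [u[j] * x0**(j - h) for j in range(n + 1)]     # J1, J2
    for k in range(n):                                 # J3
        for j in range(n - 1, k - 1, -1):
            v[j] += v[j + 1]
    return [v[j] * x0**(h - j) for j in range(n + 1)]  # J4
```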
\ansno 7. Let $x↓j = x↓0 + jh$, and consider (42), (44). Set
$y↓j ← u(x↓j)$ for $0 ≤ j ≤ n$. For $k = 1$, 2, $\ldotss$, $n$ (in
this order), set $y↓j ← y↓j - y↓{j-1}$ for $j = n$, $n - 1$, $\ldotss
$, $k$ (in this order). Now $β↓j = y↓j$ for all $j$.
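A Python sketch of this difference scheme (added for illustration; the inner loop runs with decreasing $j$ so that each $y↓{j-1}$ still holds its value from the previous pass):

```python
def difference_table(values):
    """Turn tabulated values y_j = u(x_0 + j*h) into the forward
    differences beta_j = Delta^j y_0, in place."""
    y = list(values)
    n = len(y) - 1
    for k in range(1, n + 1):
        for j in range(n, k - 1, -1):   # j = n, n-1, ..., k
            y[j] -= y[j - 1]
    return y
```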
\ansno 8. See (43).
\ansno 9. [{\sl Combinatorial Mathematics} (Buffalo: Math.\ Assoc.\ of
America, 1963), 26--28.]\xskip This formula can be regarded as
an application of the principle of inclusion and exclusion (Section
1.3.3), since the sum of the terms for $n - ε↓1 - \cdots - ε↓n
= k$ is the sum of all $x↓{1j↓1}x↓{2j↓2}\ldotsm
x↓{nj↓n}$ for which $k$ values of the $j↓i$ do not appear.
A direct proof can be given by observing that the coefficient
of $x↓{1j↓1}\ldotsm x↓{nj↓n}$ is
$$\sum (-1)↑{n-ε↓1-\cdots-ε↓n\,}ε↓{j↓1}\ldotsm ε↓{j↓n};$$
if the $j$'s are distinct, this equals unity; but
if some value $k$ appears among none of $j↓1$, $\ldotss$, $j↓n$, it
is zero, since the terms
for $ε↓k = 0$ cancel the terms for $ε↓k = 1$.
To evaluate the sum efficiently, we can start
with $ε↓1 = 1$, $ε↓2 = \cdots = ε↓n = 0$, and we can then proceed
through all combinations of the $ε$'s in such a way that only
one $ε$ changes from one term to the next.\xskip(See ``\α{Gray code}''
in Chapter↔7.)\xskip The work to compute the first term is $n - 1$
multiplications; the subsequent $2↑n - 2$ terms each involve
$n$ additions, then $n - 1$ multiplications, then one more addition.
Total: $(2↑n - 1)(n - 1)$ multiplications, and $(2↑n - 2)(n
+ 1)$ additions. Only $n + 1$ temp storage locations are needed,
one for the main partial sum and one for each factor of the
current product.
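The Gray-code evaluation can be sketched in Python (an added illustration; the column-sum array plays the role of the factors of the current product, and each step flips a single $ε↓j$):

```python
from math import prod

def permanent(a):
    """Evaluate the inclusion-exclusion sum for the permanent of an
    n x n matrix, visiting the epsilon-vectors in Gray-code order so
    that each new term costs n additions and n - 1 multiplications."""
    n = len(a)
    row = [0] * n               # row[i] = sum of a[i][j] over chosen j
    chosen = [False] * n
    total = 0
    for g in range(1, 1 << n):
        j = (g & -g).bit_length() - 1      # Gray code flips column j
        d = -1 if chosen[j] else 1
        chosen[j] = not chosen[j]
        for i in range(n):
            row[i] += d * a[i][j]
        total += (-1) ** (n - sum(chosen)) * prod(row)
    return total
```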
%folio 828 galley 7 (C) Addison-Wesley 1978 *
\ansno 10. $\sum ↓{1≤k<n} (k + 1){n\choose k+1} = n(2↑{n-1}
- 1)$ multiplications and $\sum ↓{1≤k<n} k{n\choose k+1} =
{n2↑{n-1} - 2↑n} + 1$ additions. This is approximately half as
many arithmetic operations as the method of exercise↔9, although
it requires a more complicated program to control the sequence.
Approximately ${n\choose \lceil n/2\rceil } + {n\choose \lceil
n/2\rceil -1}$ temporary storage locations must be used, and
this grows exponentially large (on the order of $2↑n/\sqrt{n}\,$).
The method in this exercise is equivalent to the
unusual matrix factorization of the permanent function given
by \α{Jurkat} and \α{Ryser} in {\sl J. Algebra \bf 5} (1967), 342--357.
It may also be regarded as an application of (39) and (40), in
an appropriate sense.
\ansno 12. Here is a brief summary of progress on this famous research
\inxf{matrix multiplication}
problem: J. \α{Hopcroft} and L. R. \α{Kerr} proved, among other
things, that seven multiplications are necessary in $2 \times 2$ matrix
multiplication [{\sl SIAM J. Appl.\ Math.\ \bf 20} (1971), 30--36].
R. L. \α{Probert} showed that all 7-multiplication schemes, in
which each multiplication takes a linear combination of elements
from one matrix and multiplies by a linear combination of elements
from the other, must have at least 15 additions [{\sl SIAM J.
Computing \bf5} (1976), 187--203]. For $n=3$, the best method known is due to
J. D. \α{Laderman} [{\sl Bull.\ Amer.\ Math.\ Soc.\ \bf 82}
(1976), 126--128], who showed that 23 noncommutative multiplications
suffice. His construction has been generalized by Ondrej \α{S\'ykora,} who exhibited a
method requiring $n↑3-(n-1)↑2$ noncommutative multiplications and
$n↑3-n↑2+11(n-1)↑2$\linebreak
additions, a result that also reduces to (36) when $n=2$
[{\sl Lecture Notes in Comp.\ Sci.\ \bf53} (1977), 504--512].
The best lower bound known to hold for all $n$ is the fact that
$2n↑2-1$ nonscalar multiplications are necessary [Jean-Paul \α{Lafon} and S. \α{Winograd},
{\sl Theoretical Comp.\ Sci.}, to appear].
\α{Pan} has generalized this to a lower bound of $mn+ns+m-n-1$ in the $m\times
n\times s$ case [see {\sl SIAM J. Computing \bf9} (1980), 341].
The best upper bounds known for large
$n$ were changing rapidly as this book was being prepared for publication;
see exercises 60--64.
\ansno 13. By summing geometric series, we find that $F(t↓1, \ldotss , t↓n)$ equals
$$\textstyle\sum ↓{0≤s↓1<m↓1,\ldots,0≤s↓n<m↓n}
\exp\biglp -2πi(s↓1t↓1/m↓1 + \cdots + s↓nt↓n/m↓n)\bigrp f(s↓1, \ldotss
, s↓n)/m↓1 \ldotsm m↓n.$$
The inverse transform times $m↓1 \ldotsm m↓n$ can be found by
doing a regular transform and interchanging $t↓j$ with $m↓j
- t↓j$ when $t↓j ≠ 0$; cf.\ exercise 4.3.3--9.
[If we regard $F(t↓1, \ldotss , t↓n)$
as the coefficient of $x↑{t↓1}↓{1} \ldotss x↑{t↓n}↓{n}$ in a
multivariate polynomial, the finite Fourier transform amounts
to evaluation of this polynomial at roots of unity, and the
inverse transform amounts to finding the \α{interpolating} polynomial.]
\ansno 14. Let $m↓1=\cdots=m↓n=2$, $F(t↓1, t↓2, \ldotss , t↓n) = F(2↑{n-1}t↓n
+ \cdots + 2t↓2 + t↓1)$, and $f(s↓1, s↓2, \ldotss , s↓n) = f(2↑{n-1}s↓1
+ 2↑{n-2}s↓2 + \cdots + s↓n)$; note the reversed treatment between
$t$'s and $s$'s. Also let $g↓k(s↓k, \ldotss , s↓n, t↓k)$ be $\omega$
raised to the $2↑{k-1}t↓k(s↓n + 2s↓{n-1} + \cdots + 2↑{n-k}s↓k)$
power.
At each iteration we essentially take $2↑{n-1}$ pairs of complex numbers $(α,β)$ and
replace them by $(α+\zeta β,α-\zeta β)$, where $\zeta$ is a suitable power of
$\omega$, hence $\zeta=\cos\theta+i\sin\theta$ for some $\theta$.
If we take advantage of simplifications when $\zeta=\pm1$ or $\pm i$, the total
work comes to $\biglp(n-3)\cdot 2↑{n-1}+2\bigrp$ complex multiplications and $n\cdot
2↑n$ complex additions; the techniques of exercise↔41 can be used to reduce the
real multiplications and additions used to implement these complex operations.
The number of complex multiplications can be reduced about 25 per cent without
changing the number of additions by combining passes $k$ and $k+1$ for $k=1$, 3,
$\ldotss$; this means that $2↑{n-2}$ quadruples $(α,β,\gamma,\delta)$ are being
replaced by $(α+\zeta β+\zeta↑2\gamma+\zeta↑3\delta$, $α+i\zeta β-\zeta↑2\gamma-
i\zeta↑3\delta$, $α-\zeta β+\zeta↑2\gamma-\zeta↑3\delta$, $α-i\zeta β-\zeta↑2\gamma
+i\zeta↑3\delta)$. The number of complex multiplications when $n$ is even is
thereby reduced to $(3n-2)2↑{n-3}-5\lfloor2↑{n-1}/3\rfloor$.
These calculations assume that the given numbers $F(t)$ are complex. If the $F(t)$
are real, then $f(s)$ is the complex conjugate of $f(2↑n-s)$, so we can avoid the
redundancy by computing only the $2↑n$ independent real numbers $f(0)$,
$\real f(1)$, $\ldotss$,↔$\real f({2↑{n-1}-1})$, $f(2↑{n-1})$, $\imag
f(1)$, $\ldotss$, $\imag f(2↑{n-1}-1)$. The entire calculation in this case can
be done by working with $2↑n$ real values, using the fact that $f↑{[k]}(s↓{n-k+1},
\ldotss,s↓n,t↓1,\ldotss,t↓{n-k})$ will be the complex conjugate of $f↑{[k]}(
s↓{n-k+1}↑\prime,\ldotss,s↓n↑\prime,t↓1,\ldotss,t↓{n-k})$ when $(s↓1\ldotsm s↓n)↓2
+(s↓1↑\prime\ldotsm s↓n↑\prime)↓2≡0\modulo{2↑n}$. About half as many multiplications
and additions are needed as in the complex case.
[The fast Fourier
transform algorithm is essentially due to C. \α{Runge} and H. \α{K\"onig}
in 1924, and it was generalized by J. W. \α{Cooley} and J.
W. \α{Tukey,} {\sl Math.\ Comp.\ \bf 19} (1965), 297--301. Its interesting
history has been traced by J. W. Cooley, P. A. W. \α{Lewis,} and P.
D. \α{Welch,} {\sl Proc.\ IEEE} {\bf 55} (1967), 1675--1677. Details
concerning its use have been discussed by R. C. \α{Singleton,} {\sl CACM
\bf 10} (1967), 647--654; M. C. \α{Pease,} {\sl JACM \bf 15} (1968),
252--264; G. D. \α{Bergland,} {\sl Math.\ Comp.\ \bf 22} (1968), 275--279,
{\sl CACM \bf 11} (1968), 703--710; A. M. \α{Macnaghten} and C. A. R. \α{Hoare,} {\sl Comp.\
J. \bf20} (1977), 78--83. See also exercises 53, 57, and↔59.]
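For reference, the pairwise $(α+\zeta β, α-\zeta β)$ replacement is the butterfly of the standard radix-2 transform; a recursive Python sketch (added; it assumes the length is a power of 2 and uses the $\exp(-2πi\,\cdot)$ sign convention of exercise 13, without the in-place pass structure analyzed above):

```python
import cmath

def fft(f):
    """Radix-2 transform: split into even/odd halves, transform each,
    then combine pairs (alpha, beta) as (alpha + zeta*beta,
    alpha - zeta*beta), where zeta is a root of unity."""
    n = len(f)
    if n == 1:
        return list(f)
    even, odd = fft(f[0::2]), fft(f[1::2])
    out = [0] * n
    for k in range(n // 2):
        zeta = cmath.exp(-2j * cmath.pi * k / n)
        out[k] = even[k] + zeta * odd[k]
        out[k + n // 2] = even[k] - zeta * odd[k]
    return out
```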
\ansno 15. (a)\9 The hint follows by integration and induction.
Let $f↑{(n)}(\theta )$ take on all values between $A$ and $B$
inclusive, as $\theta$ varies from $\min(x↓0, \ldotss , x↓n)$
to $\max(x↓0, \ldotss , x↓n)$. Replacing $f↑{(n)}$ by each of these bounds,
in the stated integral, yields $A/n! ≤ f(x↓0, \ldotss , x↓n)
≤ B/n!$.\xskip (b)↔It suffices to prove this for $j = n$. Let $f$
be Newton's interpolation polynomial, then $f↑{(n)}$ is the
constant $n! α↓n$.
\ansno 16. Carry out the multiplications and additions of (43)
as operations on polynomials.\xskip (The special case $x↓0 = x↓1 =
\cdots = x↓n$ is considered in exercise↔2. We have used this
method in step C8 of Algorithm↔4.3.3C.)
\ansno 17. T. M. \α{Vari} has shown that $n - 1$ multiplications
are necessary, by proving that $n$ multiplications are necessary to compute
$x↑{2}↓{1} + \cdots + x↑{2}↓{n}$ [Cornell Computer Science Report
120 (Jan. 1972)].
\ansno 18. $α↓0 = {1\over 2}(u↓3/u↓4 + 1)$, $β = u↓2/u↓4 - α↓0(α↓0
- 1)$, $α↓1 = α↓0β - u↓1/u↓4$, $α↓2 = β - 2α↓1$, $α↓3 = u↓0/u↓4 -
α↓1(α↓1 + α↓2)$, $α↓4 = u↓4$.
\ansno 19. Since $α↓5$ is the leading coefficient, we may assume
without loss of generality that $u(x)$ is monic (i.e., $u↓5
= 1$). Then $α↓0$ is a root of the cubic equation ${40z↑3 - 24u↓4z↑2}
+ (4u↑{2}↓{4} + 2u↓3)z + (u↓2 - u↓3u↓4) = 0$; this equation
always has at least one real root, and it may have three. Once
$α↓0$ is determined, we have $α↓3 = u↓4 - 4α↓0$, $α↓1 = u↓3 -
4α↓0α↓3 - 6α↑{2}↓{0}$, $α↓2 = u↓1 - α↓0(α↓0α↓1 + 4α↑{2}↓{0}α↓3
+ 2α↓1α↓3 + α↑{3}↓{0})$, $α↓4 = u↓0 - α↓3(α↑{4}↓{0} + α↓1α↑{2}↓{0}
+ α↓2)$.
For the given polynomial we are to solve the cubic
equation $40z↑3 - 120z↑2 + 80z = 0$; this leads to three solutions
$(α↓0, α↓1, α↓2, α↓3, α↓4, α↓5) = (0, -10, 13, 5, -5, 1)$, $(1,
-20, 68, 1, 11, 1)$, $(2, -10, 13, -3, 27, 1)$.
\ansno 20. $\vtop{\halign{\tt#\hfill\quad⊗\tt#\hfill\qquad⊗#\hfill\cr
LDA⊗X\cr
FADD⊗=$α↓3$=\cr
STA⊗TEMP1\cr
FADD⊗=$α↓0$-$α↓3$=\cr
STA⊗TEMP2\cr
FMUL⊗TEMP2\cr
\lower6pt\null STA⊗TEMP2\cr}}\hskip50pt
\vtop{\halign{\tt#\hfill\quad⊗\tt#\hfill\qquad⊗#\hfill\cr
FADD⊗=$α↓1$=\cr
FMUL⊗TEMP2\cr
FADD⊗=$α↓2$=\cr
FMUL⊗TEMP1\cr
FADD⊗=$α↓4$=\cr
FMUL⊗=$α↓5$=⊗\quad\blackslug\cr}}$
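The program evaluates $α↓5\biglp((x+α↓0)↑2((x+α↓0)↑2+α↓1)+α↓2\bigrp(x+α↓3)+α↓4\bigrp$ in four multiplications. As a cross-check against answer 19 (a Python sketch, not part of the original answers), all three parameter sets found there define one and the same quintic:

```python
def chain(x, a0, a1, a2, a3, a4, a5):
    temp1 = x + a3            # LDA X; FADD =a3=; STA TEMP1
    temp2 = (x + a0) ** 2     # FADD =a0-a3=; FMUL TEMP2; STA TEMP2
    return (((temp2 + a1) * temp2 + a2) * temp1 + a4) * a5

# The three solutions from answer 19; a degree-5 polynomial is
# determined by its values at six points, so eleven points suffice.
sets = [(0, -10, 13, 5, -5, 1), (1, -20, 68, 1, 11, 1), (2, -10, 13, -3, 27, 1)]
for x in range(-5, 6):
    assert len({chain(x, *p) for p in sets}) == 1
assert chain(1, *sets[0]) == 19
```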
\ansno 21. $z = (x + 1)x - 2$, $w = (x + 5)z + 9$, $u(x) =
(w + z - 8)w - 8$; or $z = (x + 9)x + 26$, $w = (x - 3)z + 73$,
$u(x) = (w + z - 24)w - 12$.
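Both chains expand to the same sextic, as a quick check confirms (a Python sketch, not part of the original answer; agreement at seven points suffices for degree 6):

```python
def scheme1(x):
    z = (x + 1) * x - 2
    w = (x + 5) * z + 9
    return (w + z - 8) * w - 8

def scheme2(x):
    z = (x + 9) * x + 26
    w = (x - 3) * z + 73
    return (w + z - 24) * w - 12

# both compute x^6 + 13x^5 + 49x^4 + 33x^3 - 61x^2 - 37x + 3
assert all(scheme1(x) == scheme2(x) for x in range(-3, 4))
```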
\ansno 22. $α↓6 = 1$, $α↓0 = -1$, $α↓1 = 1$, $β↓1 = -2$, $β↓2 = -2$,
$β↓3=-2$,
$β↓4 = 1$, $α↓3 = -4$, $α↓2 = 0$, $α↓4 = 4$, $α↓5 = -2$. We form $z =
(x - 1)x + 1$, $w = z + x$, and $u(x) = \biglp (z - x - 4)w +
4\bigrp z - 2$. In this
special case we see that one of the seven additions can be saved if we compute
$w=x↑2+1$, $z=w-x$.
\ansno 23. (a)\9 We may use induction on $n$; the result is trivial
if $n < 2$. If $f(0) = 0$, then the result is true for the polynomial
$f(z)/z$, so it holds for $f(z)$. If $f(iy) = 0$ for some real
$y ≠ 0$, then $g(\pm iy) = h(\pm iy) = 0$; since the result
is true for $f(z)/(z↑2 + y↑2)$, it holds also for $f(z)$. Therefore
we may assume that $f(z)$ has no roots whose real part is zero.
Now the net number of times the given path circles the origin
is the number of roots of $f(z)$ inside the region, which is
at most 1. When $R$ is large, the path $f(Re↑{it})$ for $π/2
≤ t ≤ 3π/2$ will circle the origin clockwise approximately $n/2$
times; so the path $f(it)$ for $-R ≤ t ≤ R$ must go counterclockwise
around the origin at least $n/2 - 1$ times. For $n$ even, this
implies that $f(it)$ crosses the imaginary axis at least $n
- 2$ times, and the real axis at least $n - 3$ times; for $n$
odd, $f(it)$ crosses the real axis at least $n - 2$ times and
the imaginary axis at least $n - 3$ times. These are roots respectively
of $g(it) = 0$, $h(it) = 0$.
(b)\9 If not, $g$ or $h$ would have a root of the
form $a + bi$ with $a ≠ 0$ and $b ≠ 0$. But this would imply
the existence of at least three other such roots, namely $a
- bi$ and $-a \pm bi$, while $g(z)$ and $h(z)$ have at most $n$ roots.
%folio 832 galley 8 (C) Addison-Wesley 1978 *
\ansno 24. The roots of $u$ are $-7$, $-3 \pm i$, $-2 \pm i$, and $-1$;
permissible values of $c$ are 2 and↔4 (but {\sl not\/} 3, since
$c = 3$ makes the sum of the roots equal to zero).\xskip Case 1, $c
= 2$: $p(x) = (x + 5)(x↑2 + 2x + 2)(x↑2 + 1)(x - 1) = x↑6 + 6x↑5
+ 6x↑4 + 4x↑3 - 5x↑2 - 2x - 10$; $q(x) = 6x↑2 + 4x - 2 = 6(x +
1)(x - {1\over 3})$. Let $α↓2 = -1$, $α↓1 = {1\over 3}$; $p↓1(x)
= x↑4 + 6x↑3 + 5x↑2 - 2x - 10 = (x↑2 + 6x + {16\over 3})(x↑2
- {1\over 3}) - {74\over 9}$; $α↓0 = 6$, $β↓0 = {16\over 3}$, $β↓1
= -{74\over 9}$.\xskip Case 2, $c = 4$: A similar analysis gives $α↓2
= 9$, $α↓1 = -3$, $α↓0 = -6$, $β↓0 = 12$, $β↓1 = -26$.
\ansno 25. $β↓1 = α↓2$, $β↓2 = 2α↓1$, $β↓3 = α↓7$, $β↓4 = α↓6$, $β↓5
= β↓6 = 0$, $β↓7 = α↓1$, $β↓8 = 0$, $β↓9 = 2α↓1 - α↓8$.
\ansno 26. (a)\9 $λ↓1 = α↓1 \times λ↓0$, $λ↓2 = α↓2 + λ↓1$, $λ↓3
= λ↓2 \times λ↓0$, $λ↓4 = α↓3 + λ↓3$, $λ↓5 = λ↓4 \times λ↓0$, $λ↓6
= α↓4 + λ↓5$.\xskip (b) $\kappa ↓1 = 1 + β↓1x$, $\kappa ↓2 = 1 + β↓2\kappa
↓1x$, $\kappa ↓3 = 1 + β↓3\kappa ↓2x$, $u(x) = β↓4\kappa ↓3 = β↓1β↓2β↓3β↓4x↑3
+ β↓2β↓3β↓4x↑2 + β↓3β↓4x + β↓4$.\xskip (c) If any coefficient is zero,
the coefficient of $x↑3$ must also be zero in (b), while (a)
yields an arbitrary polynomial $α↓1x↑3 + α↓2x↑2 + α↓3x + α↓4$
of degree $≤3$.
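The expansion in (b) is easy to verify for sample parameter values (a Python sketch, not part of the original answer; the particular $β$'s are arbitrary nonzero choices):

```python
b1, b2, b3, b4 = 2, 3, 5, 7   # arbitrary nonzero parameters

def u_chain(x):
    k1 = 1 + b1 * x
    k2 = 1 + b2 * k1 * x
    k3 = 1 + b3 * k2 * x
    return b4 * k3

def u_expanded(x):
    return b1*b2*b3*b4 * x**3 + b2*b3*b4 * x**2 + b3*b4 * x + b4

assert all(u_chain(x) == u_expanded(x) for x in range(-4, 5))
```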
\ansno 27. Otherwise there would be a nonzero polynomial $f(q↓n,
\ldotss , q↓1, q↓0)$ with integer coeffi\-cients such that $q↓n
\cdot f(q↓n, \ldotss , q↓1, q↓0) = 0$ for all sets $(q↓n, \ldotss
, q↓0)$ of real numbers. This cannot happen, since it is easy
to prove by induction on $n$ that a nonzero polynomial always
takes on some nonzero value.\xskip (Cf.\ exercise 4.6.1--16.
However, this result is false
for {\sl finite} fields in place of the real numbers.)
\ansno 28. The indeterminate quantities $α↓1$, $\ldotss$, $α↓s$
form an algebraic basis for the polynomial domain $Q[α↓1, \ldotss , α↓s]$, where $Q$
is the field of rational numbers. Since $s + 1$ is greater than
the number of elements in a basis, the polynomials $f↓j(α↓1,
\ldotss , α↓s)$ are algebraically dependent; this means
that there is a nonzero polynomial $g$ with rational coefficients
such that $g\biglp f↓0(α↓1, \ldotss , α↓s), \ldotss , f↓s(α↓1,
\ldotss , α↓s)\bigrp$ is identically zero.
\ansno 29. Given $j↓0$, $\ldotss$, $j↓t \in \{0, 1, \ldotss , n\}$,
there are nonzero polynomials with integer coefficients such
that $g↓j(q↓{j↓0}, \ldotss , q↓{j↓t}) = 0$ for all
$(q↓n, \ldotss , q↓0)$ in $R↓j$, $1 ≤ j ≤ m$. The product $g↓1g↓2
\ldotsm g↓m$ is therefore zero for all $(q↓n, \ldotss , q↓0)$
in $R↓1 ∪ \cdots ∪ R↓m$.
\ansno 30. Starting with the construction in Theorem↔M\null, we will
prove that $m↓p + (1 - \delta ↓{0m↓c})$ of the $β$'s
may effectively be eliminated: If $\mu↓i$ corresponds to
a parameter multiplication, we have $\mu ↓i = β↓{2i-1} \times
(T↓{2i} + β↓{2i})$; add $cβ↓{2i-1}β↓{2i}$ to each $β↓j$ for
which $c\mu ↓i$ occurs in $T↓j$, and replace $β↓{2i}$ by zero.
This removes one parameter for each parameter multiplication. If $\mu ↓i$ is the
first chain multiplication, then $\mu ↓i = (\gamma ↓1x + \theta ↓1 + β↓{2i-1})
\times (\gamma ↓2x + \theta ↓2 + β↓{2i})$, where $\gamma ↓1$,
$\gamma ↓2$, $\theta ↓1$, $\theta ↓2$ are polynomials in $β↓1$, $\ldotss
$, $β↓{2i-2}$ with integer coefficients. Here $\theta ↓1$ and
$\theta ↓2$ can be ``absorbed'' into $β↓{2i-1}$ and $β↓{2i}$,
respectively, so we may assume that $\theta ↓1 = \theta ↓2 =
0$. Now add $cβ↓{2i-1}β↓{2i}$ to each $β↓j$ for which $c\mu
↓i$ occurs in $T↓j$; add $β↓{2i-1}\gamma ↓2/\gamma ↓1$ to $β↓{2i}$;
and set $β↓{2i-1}$ to zero. The result set is unchanged by this
elimination of $β↓{2i-1}$, except for the values of $α↓1$, $\ldotss
$, $α↓s$ such that $\gamma ↓1$ is zero.\xskip [This proof is essentially
due to V. \t Ia.\ \α{Pan,} {\sl Russian Mathematical Surveys} {\bf 21},1
(1966), 105--136.]\xskip The latter case can be handled as in
the proof of Theorem↔A\null, since the polynomials with $\gamma ↓1
= 0$ can be evaluated by eliminating $β↓{2i}$ (as in the first
construction, where $\mu ↓i$ corresponds to a parameter multiplication).
\ansno 31. Otherwise we could add one parameter multiplication
as a final step, and violate Theorem↔C\null.\xskip (The exercise is an
improvement over Theorem↔A\null, in this special case, since there
are only $n$ degrees of freedom in the coefficients of a monic
polynomial of degree $n$.)
\ansno 32. $λ↓1 = λ↓0 \times λ↓0$, $λ↓2 = α↓1 \times λ↓1$, $λ↓3
= α↓2 + λ↓2$, $λ↓4 = λ↓3 \times λ↓1$, $λ↓5 = α↓3 + λ↓4$. We need
at least three multiplications to compute $u↓4x↑4$ (see Section
4.6.3), and at least two additions by Theorem↔A.
\ansno 33. We must have $n + 1 ≤ 2m↓c + m↓p + \delta ↓{0m↓c}
$, and $m↓c + m↓p = (n + 1)/2$; so there are no parameter multiplications.
Now the first $λ↓i$ whose leading coefficient (as a polynomial
in $x$) is not an integer must be obtained by a chain addition;
and there must be at least $n + 1$ parameters, so there are
at least $n + 1$ parameter additions.
\ansno 34. Transform the given chain step by step, and also
define the ``content'' $c↓i$ of $λ↓i$, as follows:\xskip (Intuitively,
$c↓i$ is the leading coefficient of $λ↓i$.)\xskip Define $c↓0 = 1$.
\xskip (a) If the step has the form $λ↓i = α↓j + λ↓k$, replace it by
$λ↓i=β↓j+λ↓k$, where $β↓j=α↓j/c↓k$; and define $c↓i = c↓k$.\xskip (b) If the step
has the form $λ↓i = α↓j - λ↓k$, replace it by $λ↓i = β↓j + λ↓k$,
where $β↓j = -α↓j/c↓k$; and define $c↓i = -c↓k$.\xskip (c) If the
step has the form $λ↓i = α↓j \times λ↓k$, replace it by $λ↓i
= λ↓k$ (the step will be deleted later); and define $c↓i = α↓jc↓k$.\xskip
(d) If the step has the form $λ↓i = λ↓j \times λ↓k$, leave it
unchanged; and define $c↓i = c↓jc↓k$.
After this process is finished, delete all steps
of the form $λ↓i = λ↓k$, replacing $λ↓i$ by $λ↓k$ in each future
step that uses $λ↓i$. Then add a final step $λ↓{r+1} = β \times
λ↓r$, where $β = c↓r$. This is the desired scheme, since it
is easy to verify that the new $λ↓i$ are just the old ones divided
by the factor $c↓i$. The $β$'s are given functions of the $α$'s;
division by zero is no problem, because if any $c↓k = 0$ we
must have $c↓r = 0$ (hence the coefficient of $x↑n$ is zero),
or else $λ↓k$ never contributes to the final result.
\ansno 35. Since there are at least five parameter steps, the
result is trivial unless there is at least one parameter multiplication;
considering the ways in which three multiplications can form
$u↓4x↑4$, we see that there must be one parameter multiplication
and two chain multiplications. Therefore the four addition-subtractions
must each be parameter steps, and exercise↔34 applies. We can
now assume that only additions are used, and that we have a
chain to compute a general {\sl monic} fourth-degree polynomial
with {\sl two} chain multiplications and four parameter additions.
The only possible scheme of this type that calculates a fourth-degree
polynomial has the form
$$\baselineskip12pt\teqalign{λ↓1⊗= α↓1 + λ↓0\cr
λ↓2⊗= α↓2 + λ↓0\cr
λ↓3⊗=λ↓1\timesλ↓2\cr
λ↓4⊗=α↓3+λ↓3\cr
λ↓5⊗=α↓4+λ↓3\cr
λ↓6⊗=λ↓4\timesλ↓5\cr
λ↓7⊗=α↓5+λ↓6\cr}$$
Actually this chain has one addition too many,
but any correct scheme can be put into this form if we restrict
some of the $α$'s to be functions of the others. Now $λ↓7$
has the form $(x↑2 + Ax + B)(x↑2 + Ax + C) + D = x↑4 + 2Ax↑3
+ (E + A↑2)x↑2 + EAx + F$, where $A = α↓1 + α↓2$,
$B = α↓1α↓2 + α↓3$, $C = α↓1α↓2 + α↓4$, $D = α↓5$, $E = B + C$,
$F = BC + D$; and since this involves only three independent
parameters it cannot represent a general monic fourth-degree
polynomial.
\ansno 36. As in the solution to exercise↔35, we may assume that the chain computes
a general monic polynomial of degree six, using only three chain multiplications and
six parameter additions. The computation must take one of two general forms
$$\baselineskip12pt
\hbox to size{$\hfill\teqalign{λ↓1⊗=α↓1+λ↓0\cr
λ↓2⊗=α↓2+λ↓0\cr
λ↓3⊗=λ↓1\timesλ↓2\cr
λ↓4⊗=α↓3+λ↓0\cr
λ↓5⊗=α↓4+λ↓3\cr
λ↓6⊗=λ↓4\timesλ↓5\cr
λ↓7⊗=α↓5+λ↓6\cr
λ↓8⊗=α↓6+λ↓6\cr
λ↓9⊗=λ↓7\times λ↓8\cr
λ↓{10}⊗=α↓7+λ↓9\cr}\hfill
\hfill\teqalign{λ↓1⊗=α↓1+λ↓0\cr
λ↓2⊗=α↓2+λ↓0\cr
λ↓3⊗=λ↓1\timesλ↓2\cr
λ↓4⊗=α↓3+λ↓3\cr
λ↓5⊗=α↓4+λ↓3\cr
λ↓6⊗=λ↓4\timesλ↓5\cr
λ↓7⊗=α↓5+λ↓3\cr
λ↓8⊗=α↓6+λ↓6\cr
λ↓9⊗=λ↓7\times λ↓8\cr
λ↓{10}⊗=α↓7+λ↓9\cr}\hfill$}$$
where, as in exercise↔35, an extra addition has been
inserted to cover a more general case. Neither of these schemes
can calculate a general sixth-degree monic polynomial, since
the first case is a polynomial of the form
$$(x↑3 + Ax↑2 + Bx + C)(x↑3 + Ax↑2 + Bx + D) + E,$$
and the second case (cf.\ exercise↔35) is a polynomial of the form
$$\biglp x↑4 + 2Ax↑3 + (E + A↑2)x↑2 + EAx + F\bigrp (x↑2 +
Ax + G) + H;$$
both of these involve only five independent parameters.
\ansno 37. Let $u↑{[0]}(x) = u↓nx↑n + u↓{n-1}x↑{n-1} + \cdots
+ u↓0$, $v↑{[0]}(x) = x↑n + v↓{n-1}x↑{n-1} + \cdots + v↓0$. For
$1 ≤ j ≤ n$, divide $u↑{[j-1]}(x)$ by the monic polynomial $v↑{[j-1]}(x)$,
obtaining $u↑{[j-1]}(x) = α↓jv↑{[j-1]}(x) + β↓jv↑{[j]}(x)$.
Assume that a monic polynomial $v↑{[j]}(x)$ of degree $n - j$
exists satisfying this relation; this will be true for almost
all rational functions.
Let $u↑{[j]}(x) = v↑{[j-1]}(x) - xv↑{[j]}(x)$.
These definitions imply that deg$(u↑{[n]}) < 1$, so we may let
$α↓{n+1} = u↑{[n]}(x)$.
For the given rational function we have
$$\vbox{\halign{$\ctr{#}$⊗\qquad$\ctr{#}$⊗\qquad$\ctr{#}$⊗\qquad$\ctr{#}$\cr
α↓j⊗β↓j⊗v↑{[j]}(x)⊗u↑{[j]}(x)\cr
\noalign{\vskip3pt}
1⊗2⊗x + 5⊗3x + 19\cr
3⊗4⊗1⊗5\cr}}$$
so $u↑{[0]}(x)/v↑{[0]}(x) = 1 + 2/\biglp x + 3 + 4/(x + 5)\bigrp $.
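From the definitions, $v↑{[0]} = xv↑{[1]} + u↑{[1]} = x↑2 + 8x + 19$ and $u↑{[0]} = α↓1v↑{[0]} + β↓1v↑{[1]} = x↑2 + 10x + 29$, so the continued fraction can be checked in exact rational arithmetic (a Python sketch, not part of the original answer):

```python
from fractions import Fraction

def cf(x):
    # u(x)/v(x) = 1 + 2/(x + 3 + 4/(x + 5))
    return 1 + 2 / (x + 3 + 4 / (x + 5))

def direct(x):
    # v = x*v1 + u1 = x^2 + 8x + 19; u = v + 2*(x + 5) = x^2 + 10x + 29
    return (x**2 + 10*x + 29) / (x**2 + 8*x + 19)

assert all(cf(Fraction(k)) == direct(Fraction(k)) for k in range(1, 8))
```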
{\sl Notes:} A general rational function of the stated
form has $2n + 1$ ``degrees of freedom,'' in the sense that
it can be shown to have $2n + 1$ essentially independent parameters.
If we generalize polynomial chains to ``\α{arithmetic chains},''
which allow division operations as well as addition, subtraction,
and multiplication, we can obtain the following results with
slight modifications to the proofs of Theorems A and↔M\null: {\sl
An arithmetic chain with $q$ addition-subtraction steps has at
most $q + 1$ degrees of freedom. An arithmetic chain with $m$
multiplication-division
steps has at most $2m + 1$ degrees of freedom.} Therefore an arithmetic
chain that computes almost all rational functions of the stated
form must have at least $2n$ addition-subtractions, and $n$
multiplication-divisions; the method in this exercise is ``optimal.''
%folio 835 galley 9a (C) Addison-Wesley 1978 *
\ansno 38. The theorem is certainly true if $n = 0$. Assume
that $n$ is positive, and that a polynomial chain computing
$P(x; u↓0, \ldotss , u↓n)$ is given, where each of the parameters
$α↓j$ has been replaced by a real number. Let $λ↓i = λ↓j \times
λ↓k$ be the first chain multiplication step that involves one
of $u↓0$, $\ldotss$, $u↓n$; such a step must exist because of the
rank of $A$. Without loss of generality, we may assume that
$λ↓j$ involves $u↓n$; thus, $λ↓j$ has the form $h↓0u↓0 + \cdots
+ h↓nu↓n + f(x)$, where $h↓0$, $\ldotss$, $h↓n$ are real, $h↓n ≠
0$, and $f(x)$ is a polynomial with real coefficients.\xskip$\biglp
$The $h$'s and the coefficients of $f(x)$ are derived from the
values assigned to the $α$'s.$\bigrp$
Now change step $i$ to $λ↓i = α \times λ↓k$, where
$α$ is an arbitrary real number.\xskip (We could take $α = 0$; general
$α$ is used here merely to show that there is a certain amount
of flexibility available in the proof.)\xskip Add further steps to
calculate
$$λ = \biglp α - f(x) - h↓0u↓0 - \cdots - h↓{n-1}u↓{n-1}\bigrp/h↓n;$$
these new steps involve only additions and parameter
multiplications (by suitable new parameters). Finally, replace
$λ↓{-n-1} = u↓n$ everywhere in the chain by this new element↔$λ$.
The result is a chain that calculates
$$Q(x; u↓0, \ldotss , u↓{n-1}) = P\biglp x; u↓0, \ldotss , u↓{n-1},
(α - f(x) - h↓0u↓0 - \cdots - h↓{n-1}u↓{n-1})/h↓n\bigrp ;$$
and this chain has one less chain multiplication.
The proof will be complete if we can show that $Q$ satisfies
the hypotheses. The quantity $\biglp α - f(x)\bigrp /h↓n$ leads
to a possibly increased value of $m$, and a new vector $B↑\prime
$. If the columns of $A$ are $A↓0$, $A↓1$, $\ldotss$, $A↓n$ (these
vectors being linearly independent over the reals), the new
matrix $A↑\prime$ corresponding to↔$Q$ has the column vectors
$$A↓0 - (h↓0/h↓n)A↓n,\qquad \ldotss ,\qquad A↓{n-1} - (h↓{n-1}/h↓n)A↓n,$$
plus perhaps a few rows of zeros to account for
an increased value of $m$, and these columns are clearly also
linearly independent. By induction, the chain that computes↔$Q$
has at least $n - 1$ chain multiplications, so the original
chain has at least $n$.
[Pan showed also that the use of division would give no improvement;
cf.\ {\sl Problemy Kibernetiki \bf7} (1962), 21--30.
Generalizations to the computation of several
polynomials in several variables, with and without various kinds
of preconditioning, have been given by S. \β{Winograd,} {\sl Comm.\
Pure and Applied Math.\ \bf 23} (1970), 165--179.]
\ansno 39. By induction on $m$. Let $w↓m(x) = x↑{2m} + u↓{2m-1}x↑{2m-1}
+ \cdots + u↓0$, $w↓{m-1}(x) = x↑{2m-2} + v↓{2m-3}x↑{2m-3} + \cdots
+ v↓0$, $a = α↓1 + \gamma ↓m$, $b = α↓m$, and let $$\textstyle f(r) = \sum ↓{i,j≥0}
(-1)↑{i+j}{i+j\choose j}u↓{r+i+2j\,}a↑ib↑j.$$It follows that $v↓r
= f(r + 2)$ for $r ≥ 0$, and $\delta ↓m = f(1)$. If $\delta
↓m = 0$ and $a$ is given, we have a polynomial of degree $m
- 1$ in $b$, with leading coefficient $\pm (u↓{2m-1} - ma) =
\pm (\gamma ↓2 + \cdots + \gamma ↓m - m\gamma ↓m)$.
In Motzkin's unpublished notes he arranged to
make $\delta ↓k = 0$ almost always, by choosing $\gamma$'s
so that this leading coefficient is $≠0$ when $m$ is
even and $=0$ when $m$ is odd; then we almost always can let $b$
be a (real) root of an odd-degree polynomial.
\ansno 40. No; S. Winograd found a way to compute all polynomials
of degree 13 with only 7 (possibly complex) multiplications
[{\sl Comm.\ Pure and Applied Math.\ \bf 25} (1972), 455--457].
L. \α{Revah} found schemes that evaluate almost all polynomials
of degree $n ≥ 9$ with $\lfloor n/2\rfloor + 1$ (possibly complex)
multiplications [{\sl SIAM J. Computing} {\bf 4} (1975), 381--392];
she also showed that when $n=9$ it is possible to achieve $\lfloor n/2\rfloor+1$
multiplications only with at least $n+3$ additions. By appending sufficiently
many additions (cf.\ exercise↔39), the ``almost all'' and ``possibly complex''
provisos disappear.\xskip V. \t Ia.\ \α{Pan} [{\sl Proc.\ ACM Symp.\
Theory Comp.\ \bf10} (1978), 162--172; IBM Research Report RC7754 (1979)]
found schemes with $\lfloor n/2\rfloor+1$ (complex)
multiplications and the minimum number $n+2+\delta↓{n9}$ of (complex) additions, for
all odd $n≥9$; his method for $n=9$ is
$$\baselineskip14pt\cpile{v(x)=\biglp(x+α)↑2+β\bigrp(x+\gamma),\qquad
w(x)=v(x)+x,\cr t(x)=\biglp v(x)+\delta\bigrp\biglp w(x)+ε\bigrp-\biglp
v(x)+\delta↑\prime\bigrp\biglp w(x)+ε↑\prime\bigrp,\cr
u(x)=\biglp v(x)+\zeta\bigrp\biglp t(x)+\eta\bigrp+\kappa.\cr}$$
The minimum number of {\sl real\/} additions necessary, when the minimum number of
(real) multiplications is achieved, remains unknown for $n≥9$.
\ansno 41. $a(c + d) - (a + b)d + i\biglp a(c + d) + (b - a)c\bigrp$.\xskip
[Beware numerical instability. Three multiplications are necessary, since complex
multiplication is a special case of (69) with $p(u)=u↑2+1$.
Without the restriction on additions there are
other possibilities. For example, the symmetric formula $ac-bd+i\biglp(a+b)(c+d)-ac-bd
\bigrp$ was suggested by Peter \α{Ungar} in 1963; cf.\ Eq.\ 4.3.3--2 with $2↑n$
replaced by $i$. See I. \α{Munro}, {\sl Proc.\ ACM Symp.\ Theory Comp.\ \bf3}
(1971), 40--44; S. Winograd, {\sl Linear Alg.\ Appl.\ \bf4} (1971), 381--388.]
Alternatively, if $a↑2+b↑2=1$ and $t=(1-a)/b=b/(1+a)$, the algorithm ``$w=c-td$,
$v=d+bw$, $u=w-tv$'' for calculating the product $(a+bi)(c+di)=u+iv$ has been
suggested by Oscar \α{Buneman} [{\sl J. Comp. Phys.\ \bf12} (1973), 127--128]. In
this method if $a=\cos\theta$ and $b=\sin\theta$, we have $t=\tan(\theta/2)$.
[Helmut \α{Alt} and Jan \α{van Leeuwen} have shown that four real multiplications
or divisions are necessary for computing $1/(a+bi)$, and four are sufficient
\inx{division of complex numbers}
for computing $a/(b+ci)$. It is unknown whether $(a+bi)/(c+di)$ can be computed
with only five multiplications or divisions.] % to appear
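Both three-real-multiplication formulas are easy to exercise numerically (a Python sketch, not part of the original answer; `buneman_mult` assumes $a↑2+b↑2=1$, as in the text):

```python
import math

def ungar_mult(a, b, c, d):
    # (a + bi)(c + di) with three real multiplications (Ungar's formula)
    p, q, r = a * c, b * d, (a + b) * (c + d)
    return p - q, r - p - q            # real part, imaginary part

def buneman_mult(a, b, c, d):
    # Assumes a*a + b*b == 1; t = (1 - a)/b = b/(1 + a).
    t = b / (1 + a)
    w = c - t * d
    v = d + b * w
    u = w - t * v
    return u, v                        # (a + bi)(c + di) = u + iv

assert ungar_mult(3, 4, 5, 6) == (-9, 38)        # (3+4i)(5+6i) = -9+38i
a, b = math.cos(0.7), math.sin(0.7)
exact = complex(a, b) * complex(2.5, -1.25)
u, v = buneman_mult(a, b, 2.5, -1.25)
assert abs(complex(u, v) - exact) < 1e-12
```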
\ansno 42. (a)\9 Let $π↓1$, $\ldotss$, $π↓m$ be the $λ↓i$'s that correspond
to chain multiplications; then $π↓i = P↓{2i-1} \times P↓{2i}$
and $u(x) = P↓{2m+1}$, where each $P↓j$ has the form $β↓j +
β↓{j0}x + β↓{j1}π↓1 + \cdots + β↓{jr(j)}π↓{r(j)}$, where $r(j)
≤ \lceil j/2\rceil - 1$ and each of the $β↓j$ and $β↓{jk}$ is
a polynomial in the $α$'s with integer coefficients. We can
systematically modify the chain (cf.\ exercise↔30) so that $β↓j
= 0$ and $β↓{jr(j)} = 1$, for $1 ≤ j ≤ 2m$; furthermore we can
assume that $β↓{30} = 0$. The result set now has at most $m
+ 1 + \sum ↓{1≤j≤2m} (\lceil j/2\rceil - 1) = m↑2 + 1$ degrees
of freedom.
(b)\9 Any such polynomial chain with at most $m$
chain multiplications can be simulated by one with the form
considered in (a), except that now we let $r(j) = \lceil j/2\rceil
- 1$ for $1 ≤ j ≤ 2m + 1$, and we do not assume that $β↓{30}
= 0$ or that $β↓{jr(j)} = 1$ for $j ≥ 3$. This single canonical
form involves $m↑2 + 2m$ parameters. As the $α$'s run through all integers
and as we run through all chains, the $β$'s run through at most
$2↑{m↑2+2m}$ sets of values mod 2, hence the result
set does also. In order to obtain all $2↑n$ polynomials of degree
$n$ with 0--1 coefficients, we need $m↑2 + 2m ≥ n$.
(c)\9 Set $m ← \lfloor \sqrt{n}\rfloor$ and compute
$x↑2$, $x↑3$, $\ldotss$, $x↑m$. Let $u(x) = u↓{m+1}(x)x↑{(m+1)m} +
\cdots + u↓1(x)x↑m + u↓0(x)$, where each $u↓j(x)$ is a polynomial
of degree $≤m$ with integer coefficients (hence it can be evaluated
without any more multiplications). Now evaluate $u(x)$ by rule
(2) as a polynomial in $x↑m$ with known coefficients.\xskip (The number
of additions used is approximately the sum of the absolute values
of the coefficients, so this algorithm is efficient on 0--1
polynomials. Paterson and Stockmeyer also gave another algorithm
that uses about $\sqrt{2n}$ multiplications.)
Reference: {\sl SIAM J. Computing \bf2} (1973), 60--66; see also
J. E. \α{Savage,} {\sl SIAM J. Computing \bf3} (1974), 150--158.
[For analogous results about
additions, see \α{Borodin} and \α{Cook},
{\sl SIAM J. Computing \bf5} (1976), 146--157; \α{Rivest} and \α{Van de
Wiele}, {\sl Inf.\ Proc.\ Letters \bf8} (1979), 178--180.]
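The square-root method of part (c) can be sketched as follows (Python, not part of the original answer; the blocks here are groups of $m$ coefficients rather than polynomials of degree $≤m$, which changes nothing essential):

```python
import math

def evaluate(u, x):
    # Paterson-Stockmeyer: evaluate u[0] + u[1]x + ... + u[n]x^n using
    # roughly m + n/m multiplications, m = floor(sqrt(n)).
    n = len(u) - 1
    m = max(1, math.isqrt(n))
    powers = [1]
    for _ in range(m):                 # x^0, x^1, ..., x^m
        powers.append(powers[-1] * x)
    result = 0
    for j in range(n // m, -1, -1):    # Horner's rule in x^m over the blocks
        block = u[j*m : j*m + m]       # coefficients of u_j(x), degree < m
        inner = sum(c * p for c, p in zip(block, powers))
        result = result * powers[m] + inner
    return result

u = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]          # a degree-10 example
assert evaluate(u, 2) == sum(c * 2**i for i, c in enumerate(u))
```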
\ansno 43. When $a↓i = a↓j + a↓k$ is a step in some optimal
addition chain for $n + 1$, compute $x↑i = x↑jx↑k$ and $p↓i=
p↓kx↑j + p↓j$, where $p↓i = x↑{i-1} + \cdots + x + 1$; omit
the final calculation of $x↑{n+1}$. We save one multiplication
whenever $a↓k = 1$, in particular when $i = 1$.\xskip $\biglp$Cf.\
exercise 4.6.3--31 with $ε = {1\over 2}$.$\bigrp$
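The rule $p↓i = p↓kx↑j + p↓j$ is easy to exercise on a small chain; the sketch below (Python, not part of the original answer; the chain for 15 is an arbitrary choice) computes $p↓{15} = x↑{14} + \cdots + x + 1$ without ever forming $x↑{15}$.

```python
# Addition chain 1, 2, 3, 6, 12, 15 for n + 1 = 15; each step i = j + k
# computes p_i = p_k * x^j + p_j, where p_i = x^(i-1) + ... + x + 1.
steps = [(1, 1), (2, 1), (3, 3), (6, 6), (12, 3)]

def p15(x):
    xp = {1: x}   # powers x^a for chain elements a
    p = {1: 1}    # p_1 = 1
    for j, k in steps:
        i = j + k
        p[i] = p[k] * xp[j] + p[j]
        if i < 15:                 # omit the final power x^15
            xp[i] = xp[j] * xp[k]
    return p[15]

assert p15(2) == 2**15 - 1
assert all(p15(x) == sum(x**i for i in range(15)) for x in (3, 5, 10))
```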
\ansno 44. It suffices to show that $(T↓{ijk})$'s rank is {\sl at most} that of
$(t↓{ijk})$, since we can obtain $(t↓{ijk})$ back from $(T↓{ijk})$ by transforming
it in the same way with $F↑{-1}$, $G↑{-1}$, $H↑{-1}$. If $t↓{ijk}=\sum↓{1≤l≤r}
a↓{il\,}b↓{jl\,}c↓{kl}$ then it follows immediately that
$$\textstyle T↓{ijk}=\sum↓{1≤l≤r}\biglp\sum↓{1≤p≤m}F↓{ip\,}a↓{pl}\bigrp
\biglp\sum↓{1≤q≤n}G↓{jq\,}b↓{ql}\bigrp
\biglp\sum↓{1≤w≤s}H↓{kw\,}c↓{wl}\bigrp.$$
[H. F. \α{de Groote} has proved that all normal schemes that yield $2\times2$ matrix
products with seven chain multiplications are equivalent, in the sense that they can
be obtained from each other by nonsingular matrix multiplication as in this
exercise. In this sense \α{Strassen}'s algorithm is unique.]
\ansno 45. By exercise↔44 we can add any multiple of a row, column, or plane to
another one without changing the rank; we can also multiply a row, column,
or plane by a nonzero constant, or transpose the tensor. A sequence of such
operations can always be found to reduce a given $2\times2\times2$ tensor to one
of the forms ${0\,0\choose0\,0}{0\,0\choose0\,0}$, ${1\,0\choose0\,0}{0\,0\choose
0\,0}$, ${1\,0\choose0\,1}{0\,0\choose0\,0}$, ${1\,0\choose0\,0}{0\,0\choose0\,1}$,
${1\,0\choose0\,1}{0\,1\choose q\,r}$. The last tensor has rank 3 or 2 according as
the polynomial $u↑2-ru-q$ has one or two irreducible factors in the field of
interest, by Theorem↔W $\biglp$cf.↔(72)$\bigrp$.
\ansno 46. A general $m\times n\times s$ tensor has $mns$ degrees of freedom. By
exercise↔28 it is impossible to express all $m\times n\times s$ tensors in terms
of the $(m+n+s)r$ elements of a realization $A$, $B$, $C$ unless $(m+n+s)r≥mns$.
On the other hand, assume that $m≥n≥s$. The rank of an $m\times n$ matrix is at most
$n$, so we can realize any tensor in $ns$ chain multiplications by realizing each
matrix plane separately.\xskip[Exercise↔45 shows that this lower bound on the maximum
tensor rank is not best possible, nor is the upper bound. Thomas D. \α{Howell}
(Ph.\ D. thesis, Cornell Univ., 1976) has shown that there are tensors of rank
$≥\lceil mns/(m+n+s-2)\rceil$ over the complex numbers.]
\ansno 48. If $A$, $B$, $C$ and $A↑\prime$, $B↑\prime$, $C↑\prime$ are realizations
of $(t↓{ijk})$ and $(t↓{ijk}↑\prime)$ of respective lengths $r$ and $r↑\prime$, then
$A↑{\prime\prime}=A\oplus A↑\prime$,
$B↑{\prime\prime}=B\oplus B↑\prime$,
$C↑{\prime\prime}=C\oplus C↑\prime$,
and
$A↑{\prime\prime\prime}=A\otimes A↑\prime$,
$B↑{\prime\prime\prime}=B\otimes B↑\prime$,
$C↑{\prime\prime\prime}=C\otimes C↑\prime$,
are realizations of $(t↓{ijk}↑{\prime\prime})$ and $(t↓{ijk}↑{\prime\prime\prime})$
of respective lengths $r+r↑\prime$ and $r\cdot r↑\prime$.
{\sl Note:} Many people have made the natural conjecture that \def\\{\hbox{rank}}
$\\\biglp(t↓{ijk})\oplus(t↓{ijk}↑\prime)\bigrp=\\(t↓{ijk})+\\(t↓{ijk}↑\prime)$,
but the construction in exercise 60(b) makes this seem much less plausible than
it once was.
\ansno 49. By Lemma↔T\null, rank$(t↓{ijk})≥\hbox{rank}(t↓{i(jk)})$. Conversely if
$M$ is a matrix of rank $r$ we can transform it by row and column operations,
finding nonsingular matrices $F$ and $G$ such that $FMG$ has all entries 0 except
for $r$ diagonal elements that are↔1; cf.\ Algorithm↔4.6.2N\null. The tensor rank
of $FMG$ is therefore $≤r$; and it is the same as the tensor rank of $M$, by
exercise↔44.
\ansno 50. Let $i=\langle i↑\prime,i↑{\prime\prime}\rangle$ where $1≤i↑\prime≤m$
and $1≤i↑{\prime\prime}≤n$; then $t↓{\langle i↑\prime,i↑{\prime\prime}\rangle jk}
=\delta↓{i↑{\prime\prime}j}\delta↓{i↑\prime k}$, and it is clear that
rank$(t↓{i(jk)})=mn$ since $(t↓{i(jk)})$ is a permutation matrix. By Lemma↔L\null,
rank$(t↓{ijk})≥mn$. Conversely, since $(t↓{ijk})$ has only $mn$ nonzero entries,
its rank is clearly $≤mn$.\xskip(There is consequently no normal scheme requiring
fewer than the $mn$ obvious multiplications. There is no such abnormal scheme
either [{\sl Comm.\ Pure and Appl.\ Math.\ \bf23} (1970), 165--179]. But some
savings can be achieved if the same matrix is used with $s>1$ different column
vectors, since this is equivalent to $(m\times n)$ times $(n\times s)$ matrix
multiplication.)
\ansno 51. (a)\9 $s↓1=y↓0+y↓1$, $s↓2=y↓0-y↓1$; $m↓1={1\over2}(x↓0+x↓1)s↓1$,
$m↓2={1\over2}(x↓0-x↓1)s↓2$; $w↓0=m↓1+m↓2$, $w↓1=m↓1-m↓2$.\xskip(b) Here are some
intermediate steps, using the methodology in the text: $\biglp(x↓0-x↓2)+(x↓1-x↓2)u
\bigrp\biglp(y↓0-y↓2)+(y↓1-y↓2)u\bigrp\mod(u↑2+u+1)=\biglp(x↓0-x↓2)(y↓0-y↓2)-
(x↓1-x↓2)(y↓1-y↓2)\bigrp+\biglp(x↓0-x↓2)(y↓0-y↓2)-(x↓1-x↓0)(y↓1-y↓0)\bigrp u$.
The first realization is
$$\def\\#1.#2.#3.{\biggglp\,\vcenter{\vskip-4pt
\halign{##\cr\vbox to 9pt{}#1\cr#2\cr#3\cr}}\,\bigggrp}
\\1 1 \=1 0.1 0 1 1.1 \=1 0 \=1.,\qquad
\\1 1 \=1 0.1 0 1 1.1 \=1 0 \=1.,\qquad
\\1 1 1 \=2.1 1 \=2 1.1 \=2 1 1.\times{1\over3}.$$
The second realization is
$$\def\\#1.#2.#3.{\biggglp\,\vcenter{\vskip-4pt
\halign{##\cr\vbox to 9pt{}#1\cr#2\cr#3\cr}}\,\bigggrp}
\\1 1 1 \=2.1 1 \=2 1.1 \=2 1 1.\times{1\over3},\qquad
\\1 1 \=1 0.1 \=1 0 \=1.1 0 1 1.,\qquad
\\1 1 \=1 0.1 0 1 1.1 \=1 0 \=1..$$
The resulting algorithm computes $s↓1=y↓0+y↓1$, $s↓2=y↓0-y↓1$, $s↓3=y↓2-y↓0$,
$s↓4=y↓2-y↓1$, $s↓5=s↓1+y↓2$; $m↓1={1\over3}(x↓0+x↓1+x↓2)s↓5$,
$m↓2={1\over3}(x↓0+x↓1-2x↓2)s↓2$, $m↓3={1\over3}(x↓0-2x↓1+x↓2)s↓3$,
$m↓4={1\over3}(-2x↓0+x↓1+x↓2)s↓4$; $t↓1=m↓1+m↓2$, $t↓2=m↓1-m↓2$, $t↓3=m↓1+m↓3$,
$w↓0=t↓1-m↓3$, $w↓1=t↓3+m↓4$, $w↓2=t↓2-m↓4$.
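The four-multiplication scheme of part (b) can be checked against a brute-force cyclic convolution (a Python sketch, not part of the original answer; exact rational arithmetic keeps the factors of $1\over3$ from introducing rounding error):

```python
from fractions import Fraction as F

def cyclic3(x, y):
    # the 4-multiplication degree-3 cyclic convolution from part (b)
    x0, x1, x2 = x
    y0, y1, y2 = y
    s1, s2, s3, s4 = y0 + y1, y0 - y1, y2 - y0, y2 - y1
    s5 = s1 + y2
    m1 = F(x0 + x1 + x2, 3) * s5
    m2 = F(x0 + x1 - 2*x2, 3) * s2
    m3 = F(x0 - 2*x1 + x2, 3) * s3
    m4 = F(-2*x0 + x1 + x2, 3) * s4
    t1, t2, t3 = m1 + m2, m1 - m2, m1 + m3
    return (t1 - m3, t3 + m4, t2 - m4)

def brute(x, y):
    # w_k = sum of x_i * y_j over i + j = k (mod 3)
    return tuple(sum(x[i] * y[(k - i) % 3] for i in range(3))
                 for k in range(3))

for x in [(1, 2, 3), (5, 0, -7), (2, -1, 4)]:
    for y in [(4, 5, 6), (-3, 8, 1)]:
        assert cyclic3(x, y) == brute(x, y)
```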
\def\dprime{{\prime\prime}}
\ansno 52. Let $i=\langle i↑\prime,i↑\dprime\rangle$ when $i\mod n↑\prime=i↑\prime$
and $i\mod n↑\dprime=i↑\dprime$. Then we wish to compute $w↓{\langle k↑\prime,
k↑\dprime\rangle}=\sum x↓{\langle i↑\prime,i↑\dprime\rangle}y↓{\langle j↑\prime,
j↑\dprime\rangle}$ summed for $i↑\prime+j↑\prime≡k↑\prime\modulo{n↑\prime}$ and
$i↑\dprime+j↑\dprime≡k↑\dprime\modulo{n↑\dprime}$. This can be done by applying
the $n↑\prime$ algorithm to the $2n↑\prime$ vectors $X↓{i↑\prime}$ and $Y↓{j↑\prime
}$ of length $n↑\dprime$, obtaining the $n↑\prime$ vectors $W↓{k↑\prime}$. Each
vector addition becomes $n↑\dprime$ additions, each parameter multiplication
becomes $n↑\dprime$ parameter multiplications, and each chain multiplication of
vectors is replaced by a cyclic convolution of degree $n↑\dprime$.\xskip [If
the subalgorithms use the minimum number of chain multiplications, this algorithm
uses $2\biglp n↑\prime-d(n↑\prime)\bigrp\biglp n↑\dprime-d(n↑\dprime)\bigrp$ more
than the minimum, where $d(n)$ is the number of divisors of $n$.]
\ansno 53. (a)\9 Let $n(k)=(p-1)p↑{e-k-1}=\varphi(p↑{e-k})$ for $0≤k<e$, and $n(k)=
1$ for $k≥e$. Represent the numbers $\{1,\ldotss,m\}$ in the form $a↑ip↑k\modulo m$,
where $0≤k≤e$ and $0≤i<n(k)$,
and $a$ is a fixed primitive element modulo $p↑e$. For example,
when ${m=9}$ we can let $a=2$; the values are $\{2↑03↑0,2↑13↑0,2↑03↑1,2↑23↑0,2↑53↑0,
2↑13↑1,2↑43↑0,2↑33↑0,2↑03↑2\}$. Then $f(a↑ip↑k)=\sum↓{0≤l≤e}\sum↓{0≤j<n(l)}
\omega↑{g(i,j,k,l)}F(a↑jp↑l)$ where $g(i,j,k,l)=a↑{i+j}p↑{k+l}$.
We shall compute $f↓{ikl}=\sum↓{0≤j<n(l)}\omega↑{g(i,j,k,l)}F(a↑jp↑l)$ for $0≤i<
n(k)$ and for each $k$, $l$. This is a cyclic convolution of degree $n(k+l)$ on
the values $x↓i=\omega↑{a↑ip↑{k+l}}$ and $y↓s=\sum↓{0≤j<n(l),\,s+j≡0\,\modulo{n(k+l)}}
F(a↑jp↑l)$, since $f↓{ikl}=\sum x↓ry↓s$ summed over $r+s≡i$ $\biglp$modulo
$n(k+l)\bigrp$. The Fourier transform is obtained by summing appropriate
$f↓{ikl}$'s.\xskip[{\sl Note:} When linear combinations of the $x↓i$ are
formed, e.g., as in (67), the result will be purely real or purely imaginary,
when the cyclic convolution algorithm has been constructed by using rule (57) with
$u↑{n(k)}-1=(u↑{n(k)/2}-1)(u↑{n(k)/2}+1)$. The reason is that reduction mod
$(u↑{n(k)/2}-1)$ produces a polynomial with real coefficients $\omega↑j+\omega↑{-j}$
while reduction mod $(u↑{n(k)/2}+1)$ produces a polynomial with imaginary
coefficients $\omega↑j-\omega↑{-j}$.]
When $p=2$ an analogous construction applies, using the representation $(-1)↑ia↑j
2↑k\modulo m$, where $0≤k≤e$ and $0≤i≤\min(e-k,1)$ and $0≤j<2↑{e-k-2}$. In this
case we use the construction of exercise↔52 with $n↑\prime=2$ and $n↑\dprime=
2↑{e-k-2}$; although these numbers are not relatively prime, the construction does
yield the desired direct product of cyclic convolutions.
(b)\9 Let $a↑\prime m↑\prime+a↑\dprime m↑\dprime=1$; and let $\omega↑\prime=
\omega↑{a↑\dprime m↑\dprime}$, $\omega↑\dprime=\omega↑{a↑\prime m↑\prime}$. Define
$s↑\prime=s\mod m↑\prime$, $s↑\dprime=s\mod m↑\dprime$, $t↑\prime=t\mod m↑\prime$,
$t↑\dprime=t\mod m↑\dprime$, so that $\omega↑{st}=(\omega↑\prime)↑{s↑\prime
t↑\prime}(\omega↑\dprime)↑{s↑\dprime t↑\dprime}$. It follows that $f(s↑\prime,s↑
\dprime)=\sum↓{0≤t↑\prime<m↑\prime,\,0≤t↑\dprime<m↑\dprime}(\omega↑\prime)↑{s↑
\prime t↑\prime}(\omega↑\dprime)↑{s↑\dprime t↑\dprime}F(t↑\prime,t↑\dprime)$; in
other words, the one-dimensional Fourier transform on $m$ elements is actually a
two-dimensional Fourier transform on $m↑\prime\times m↑\dprime$ elements, in
slight disguise.
We shall deal with ``normal'' algorithms consisting of (i) a number of sums $s↓i$
of the $F$'s and $s$'s; followed by (ii) a number of products $m↓j$, each of which
is obtained by multiplying one of the $F$'s or $s$'s by a real or imaginary number
$α↓j$; followed by (iii) a number of further sums $t↓k$, each of which is formed
from $m$'s or $t$'s (not $F$'s or $s$'s). The final values must be $m$'s or $t$'s.
For example, the ``normal'' Fourier transform scheme for $m=5$ constructed from
(67) and the method of part (a) is as follows: $s↓1=F(1)+F(4)$, $s↓2=F(3)+F(2)$,
$s↓3=s↓1+s↓2$, $s↓4=s↓1-s↓2$, $s↓5=F(1)-F(4)$, $s↓6=F(2)-F(3)$, $s↓7=s↓5-s↓6$;
$m↓1={1\over4}(\omega+\omega↑2+\omega↑4+\omega↑3)s↓3$, $m↓2={1\over4}(\omega-
\omega↑2+\omega↑4-\omega↑3)s↓4$, $m↓3={1\over2}(\omega+\omega↑2-\omega↑4-\omega↑3)
s↓5$, $m↓4={1\over2}(-\omega+\omega↑2+\omega↑4-\omega↑3)s↓6$, $m↓5={1\over2}(
\omega↑3-\omega↑2)s↓7$, $m↓6=1\cdot F(5)$, $m↓7=1\cdot s↓3$; $t↓0=m↓1+m↓6$, $t↓1=
t↓0+m↓2$, $t↓2=m↓3+m↓5$, $t↓3=t↓0-m↓2$, $t↓4=m↓4-m↓5$, $t↓5=t↓1+t↓2$, $t↓6=t↓3+t↓4$,
$t↓7=t↓1-t↓2$, $t↓8=t↓3-t↓4$, $t↓9=m↓6+m↓7$. Note the multiplication by 1 shown in
$m↓6$ and $m↓7$; this is required by our conventions, and it is important to
include such cases for use in recursive constructions (although the multiplications
need not really be done). Here $m↓6=f↓{001}$, $m↓7=f↓{010}$, $t↓5=f↓{000}+f↓{001}
=f(2↑0)$, $t↓6=f↓{100}+f↓{101}=f(2↑1)$, etc. We can improve the scheme by
introducing $s↓8=s↓3+F(5)$, replacing $m↓1$ by $\biglp{1\over4}(\omega+\omega↑2
+\omega↑4+\omega↑3)-1\bigrp s↓3$ [this is $-{5\over4}s↓3$], replacing $m↓6$ by
$1\cdot s↓8$, and deleting $m↓7$ and $t↓9$; this saves one of the trivial
multiplications by 1, and it will be advantageous when the scheme is used to
build larger ones. In the improved scheme, $f(5)=m↓6$, $f(1)=t↓5$, $f(2)=t↓6$,
$f(3)=t↓8$, $f(4)=t↓7$.
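As a check on the arithmetic, the improved scheme can be transcribed into Python (the function name `winograd5` and the 1-origin list indexing are ours); it agrees with direct evaluation of $f(j)=\sum_{1≤t≤5}F(t)\omega↑{jt}$:

```python
import cmath

def winograd5(F):
    # improved Winograd 5-point scheme from the text; F[1..5] are the inputs
    w = cmath.exp(2j*cmath.pi/5)
    s1 = F[1] + F[4];  s2 = F[3] + F[2]
    s3 = s1 + s2;      s4 = s1 - s2
    s5 = F[1] - F[4];  s6 = F[2] - F[3]
    s7 = s5 - s6
    s8 = s3 + F[5]                              # the new sum of the improved scheme
    m1 = ((w + w**2 + w**4 + w**3)/4 - 1)*s3    # equals -(5/4)s3
    m2 = (w - w**2 + w**4 - w**3)/4 * s4
    m3 = (w + w**2 - w**4 - w**3)/2 * s5
    m4 = (-w + w**2 + w**4 - w**3)/2 * s6
    m5 = (w**3 - w**2)/2 * s7
    m6 = 1*s8                                   # the one remaining trivial multiplication
    t0 = m1 + m6
    t1 = t0 + m2;  t2 = m3 + m5
    t3 = t0 - m2;  t4 = m4 - m5
    t5 = t1 + t2;  t6 = t3 + t4
    t7 = t1 - t2;  t8 = t3 - t4
    return {5: m6, 1: t5, 2: t6, 3: t8, 4: t7}  # f(5), f(1), f(2), f(3), f(4)
```

The scheme uses 17 additions, 6 multiplications (one trivial), matching the table below.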
Now suppose we have normal one-dimensional schemes for $m↑\prime$ and $m↑\dprime$,
using respectively $(a↑\prime,a↑\dprime)$ complex additions, $(t↑\prime,t↑\dprime)$
trivial multiplications by $\pm1$ or $\pm i$, and a total of $(c↑\prime,c↑\dprime)$
complex multiplications including the trivial ones.\xskip(The nontrivial complex
multiplications are all ``simple'' since they involve only two real multiplications
and no real additions.)\xskip We can construct a normal scheme for the
two-dimensional $m↑\prime\times m↑\dprime$ case by applying the $m↑\prime$ scheme
to vectors $F(t↑\prime,\ast)$ of length $m↑\dprime$. Each $s↓i$ step becomes
$m↑\dprime$ additions; each $m↓j$ becomes a Fourier transform on $m↑\dprime$
elements, but with all of the $α$'s in this algorithm multiplied by $α↓j$; and
each $t↓k$ becomes $m↑\dprime$ additions. Thus the new algorithm has $(a↑\prime
m↑\dprime+c↑\prime a↑\dprime)$ complex additions, $t↑\prime t↑\dprime$ trivial
multiplications, and a total of $c↑\prime c↑\dprime$ complex multiplications.
Using these techniques, Winograd has found normal one-dimensional schemes
for the following small values of $m$ with the following costs $(a,t,c)$:
$$\baselineskip13pt
\vbox{\halign{$m=#\hfill$⊗\quad$(\hfill#$,⊗$\,#,$⊗$\,\hfill#)$⊗\hskip 100pt
$m=#\hfill$⊗\quad$(\hfill#$,⊗$\,#,$⊗$\,\hfill#)$\cr
2⊗2⊗2⊗2⊗7⊗36⊗1⊗9\cr
3⊗6⊗1⊗3⊗8⊗26⊗6⊗8\cr
4⊗8⊗4⊗4⊗9⊗46⊗1⊗12\cr
5⊗17⊗1⊗6⊗16⊗74⊗8⊗18\cr}}$$
By combining these schemes as described above, we obtain methods that use fewer
arithmetic operations than the ``\α{fast Fourier transform}'' (FFT) discussed in
exercise↔14. For example, when $m=1008=7\cdot9\cdot16$, the costs come to
$(17946,8,1944)$, so we can do a Fourier transform on 1008 complex numbers with
3872 real multiplications and 35892 real additions.
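The cost bookkeeping can be checked mechanically: an $m'$ scheme applied outermost to an $m''$ scheme costs $(a'm''+c'a'',\,t't'',\,c'c'')$, and each nontrivial multiplication is ``simple'' (two real multiplications) while each complex addition is two real additions. A sketch (the helper name `combine` is ours), building $1008=16\cdot(7\cdot9)$ with the 16-point scheme outermost:

```python
def combine(outer, inner_m, inner):
    # costs (a, t, c) of applying the outer scheme to vectors of length inner_m
    a1, t1, c1 = outer
    a2, t2, c2 = inner
    return (a1*inner_m + c1*a2, t1*t2, c1*c2)

costs = {2: (2, 2, 2), 3: (6, 1, 3), 4: (8, 4, 4), 5: (17, 1, 6),
         7: (36, 1, 9), 8: (26, 6, 8), 9: (46, 1, 12), 16: (74, 8, 18)}
c63 = combine(costs[7], 9, costs[9])       # m = 63 = 7*9
c1008 = combine(costs[16], 63, c63)        # m = 1008 = 16*63
a, t, c = c1008
real_mults = 2*(c - t)   # only the nontrivial multiplications cost real mults
real_adds = 2*a
```

Note that the order of combination matters; other orderings of 7, 9, 16 give more additions.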
It is possible to improve on Winograd's method for combining relatively prime
moduli by using multidimensional convolutions, as shown by \α{Nussbaumer} and
\inx{convolutions, multidimensional}
\α{Quandalle} in {\sl IBM J. Res.\ and Devel.\ \bf22} (1978), 134--144; their
ingenious approach reduces the amount of computation needed for 1008-point
complex Fourier transforms to 3084 real multiplications and 34668 real additions.
By contrast, the FFT on 1024
complex numbers involves 14344 real multiplications and 27652 real additions.
If the two-passes-at-once improvement in the answer to exercise↔14 is used,
however, the FFT on 1024 complex numbers needs only 10936 real multiplications and
25948 additions, and it is not difficult to implement. Therefore the subtler methods
are faster only on machines that take significantly longer to multiply than to add.
[References: {\sl Proc.\ Nat.\ Acad.\ Sci.\ USA \bf73} (1976), 1005--1006;
{\sl Math.\ Comp.\ \bf32} (1978), 175--199; {\sl Advances in Math.\ \bf32} (1979),
83--117.]
\ansno 54. $\max\biglp 2e↓1\hbox{deg}(p↓1)-1,\ldotss,2e↓q\hbox{deg}(p↓q)-1,q+1\bigrp$.
\ansno 55. $2n↑\prime-q↑\prime$, where $n↑\prime$ is the degree of the minimum
polynomial of $P$ (i.e., the monic
polynomial $\mu$ of least degree such that $\mu(P)$ is the zero matrix) and
$q↑\prime$ is the number of distinct irreducible factors it has.\xskip(Reduce $P$
by similarity transformations.)
\ansno 56. Let $t↓{ijk}+t↓{jik}=\tau↓{ijk}+\tau↓{jik}$, for all $i$, $j$, $k$. If
$A$, $B$, $C$ is a realization of $(t↓{ijk})$ of rank $r$, then $\lower6pt\null
\sum↓{1≤l≤r}c↓{kl\,}\biglp\sum a↓{il\,}x↓i\bigrp\biglp\sum b↓{jl\,}x↓j\bigrp=
\sum↓{i,j}t↓{ijk}x↓ix↓j=\sum↓{i,j}\tau↓{ijk}x↓ix↓j$ for all $k$. Conversely, let the
$l$th chain multiplication of a polynomial chain, for $1≤l≤r$, be the product
$\biglp α↓l+\sum α↓{il\,}x↓i\bigrp\biglp β↓l+\sum β↓{jl}x↓j\bigrp$, where
$α↓l$ and $β↓l$ denote possible constant terms and/or nonlinear terms. All terms
of degree 2 appearing at any step of the chain can be expressed as a linear
combination $\sum↓{1≤l≤r}c↓{l\,}\biglp\sum a↓{il\,}x↓i\bigrp\biglp\sum
b↓{jl\,}x↓j\bigrp$; hence the chain defines a tensor $(t↓{ijk})$ of rank $≤r$
such that $t↓{ijk}+t↓{jik}=\tau↓{ijk}+\tau↓{jik}$. This establishes the hint. Now
\def\\{\hbox{rank}}$\\(\tau↓{ijk}+\tau↓{jik})=\\(t↓{ijk}+t↓{jik})≤\\(t↓{ijk})+\\(t↓
{jik})=2\,\\(t↓{ijk})$.
A bilinear form in $x↓1$, $\ldotss$, $x↓m$, $y↓1$, $\ldotss$, $y↓n$ is a
quadratic form in $m+n$ variables, where $\tau↓{ijk}=t↓{i,j-m,k}$ for $i≤m$ and $j>m$,
otherwise $\tau↓{ijk}=0$. Now $\\(\tau↓{ijk})+\\(\tau↓{jik})≥\\(t↓{ijk})$, since we obtain
a realization of $(t↓{ijk})$ by suppressing the last $n$ rows of $A$ and the first
$m$ rows of $B$ in a realization $A$, $B$, $C$ of $(\tau↓{ijk}+\tau↓{jik})$.
\ansno 57. Let $N$ be the smallest power of 2 that exceeds
$2n$, and let $u↓{n+1} = \cdots = u↓{N-1} = v↓{n+1} = \cdots
= v↓{N-1} = 0$. If $U↓s = \sum ↓{0≤t<N} \omega ↑{st}u↓t$, $V↓s
= \sum ↓{0≤t<N} \omega ↑{st}v↓t$, ${0 ≤ s < N}$, $\omega = e↑{2πi/N}$,
then $\sum ↓{0≤s<N} \omega ↑{-st}U↓sV↓s = N \sum u↓{t↓1}v↓{t↓2}$, where
the latter sum is taken over all $t↓1$ and $t↓2$
with $0 ≤ t↓1, t↓2 < N$, $t↓1 + t↓2 ≡ t\modulo N$. The terms
vanish unless $t↓1 ≤ n$ and $t↓2 ≤ n$, so $t↓1 + t↓2 < N$; thus the
sum is the coefficient of $z↑t$ in the product $u(z)v(z)$. If
we use the method of exercise↔14 to compute the Fourier transforms and
the inverse transforms, the number of complex operations is
$O(N\log N) + O(N \log N) + O(N) + O(N\log N)$; and $N
< 4n$.\xskip [Cf.\ Section 4.3.3 and the paper by J. M. \α{Pollard,} {\sl Math.\
Comp.\ \bf 25} (1971), 365--374.]
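The construction just described can be sketched in Python (the function name is ours; a slow $O(N↑2)$ DFT stands in for the fast transform of exercise 14, so only the padding and pointwise-product structure is illustrated):

```python
import cmath

def poly_mul_fft(u, v):
    # coefficients of u(z)v(z): pad to length N (smallest power of 2 > 2n),
    # transform, multiply pointwise, untransform, divide by N
    n = max(len(u), len(v)) - 1       # bound on the degrees
    N = 1
    while N <= 2*n:
        N *= 2
    w = cmath.exp(2j*cmath.pi/N)
    uu = list(u) + [0]*(N - len(u))
    vv = list(v) + [0]*(N - len(v))
    U = [sum(uu[t]*w**(s*t) for t in range(N)) for s in range(N)]
    V = [sum(vv[t]*w**(s*t) for t in range(N)) for s in range(N)]
    Z = [sum(U[s]*V[s]*w**(-s*t) for s in range(N))/N for t in range(N)]
    return [z.real for z in Z[:len(u) + len(v) - 1]]
```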
When multiplying integer polynomials, it is possible to use an {\sl integer} number
$\omega$ that is of order $2↑t$ modulo a prime $p$, and to
determine the results modulo sufficiently many primes. Useful
\inx{primes, useful}
primes in this regard, together with their least primitive roots↔$r$
(from which we take $\omega = r↑{(p-1)/2↑t}\mod
p$ when $p \mod 2↑t = 1)$, can be found as described in Section
4.5.4. For $t = 9$, the ten largest cases $<2↑{35}$ are $p = 2↑{35}
- 512a + 1$, where $(a, r) = (28, 7)$, $(31, 10)$, $(34, 13)$, $(56,
3)$, $(58, 10)$, $(76, 5)$, $(80, 3)$, $(85, 11)$, $(91, 5)$, $(101, 3)$;
the ten largest cases $<2↑{31}$ are $p = 2↑{31} - 512a + 1$, where
$(a, r) = (1, 10)$, $(11, 3)$, $(19, 11)$, $(20, 3)$, $(29, 3)$, $(35,
3)$, $(55, 19)$, $(65, 6)$, $(95, 3)$, $(121, 10)$. For larger $t$,
all primes $p$ of the form $2↑tq + 1$ where $q < 32$ is odd
and $2↑{24} < p < 2↑{36}$ are given by $(p - 1, r) = (11 \cdot
2↑{21}, 3)$, $(25 \cdot 2↑{20}, 3)$, $(27 \cdot 2↑{20}, 5)$, $(25
\cdot 2↑{22}, 3)$, $(27 \cdot 2↑{22}, 7)$, $(5 \cdot 2↑{25}, 3)$,
$(7 \cdot 2↑{26}, 3)$, $(27 \cdot 2↑{26},
13)$, $(15 \cdot 2↑{27}, 31)$, $(17 \cdot 2↑{27}, 3)$, $(3 \cdot 2↑{30},
5)$, $(13 \cdot 2↑{28}, 3)$, $(29 \cdot 2↑{27}, 3)$, $(23 \cdot
2↑{29}, 5)$. Some of the latter primes can be used with $\omega
= 2↑e$ for appropriate small $e$. For a discussion of such primes, see R. M.
\α{Robinson,} {\sl Proc.\ Amer.\ Math.\ Soc.\ \bf9} (1958), 673--681; S. W. \α{Golomb},
{\sl Math.\ Comp.\ \bf30} (1976), 657--663.
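These tables are easy to spot-check: when $p$ is prime and $r$ is a primitive root (hence a quadratic nonresidue), Fermat's theorem gives $\omega↑{2↑t}=r↑{p-1}=1$ and Euler's criterion gives $\omega↑{2↑{t-1}}=r↑{(p-1)/2}=-1$, so $\omega=r↑{(p-1)/2↑t}\mod p$ has order exactly $2↑t$. A sketch (helper name ours) for the two $t=9$ lists:

```python
def check_case(p, r, t):
    # omega = r^((p-1)/2^t) mod p should have multiplicative order exactly 2^t
    assert p % 2**t == 1
    w = pow(r, (p - 1)//2**t, p)
    return pow(w, 2**t, p) == 1 and pow(w, 2**(t - 1), p) == p - 1

cases35 = [(28, 7), (31, 10), (34, 13), (56, 3), (58, 10),
           (76, 5), (80, 3), (85, 11), (91, 5), (101, 3)]
cases31 = [(1, 10), (11, 3), (19, 11), (20, 3), (29, 3),
           (35, 3), (55, 19), (65, 6), (95, 3), (121, 10)]
ok35 = all(check_case(2**35 - 512*a + 1, r, 9) for a, r in cases35)
ok31 = all(check_case(2**31 - 512*a + 1, r, 9) for a, r in cases31)
```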
However, the method of exercise 59 will almost always be preferable in practice.
\ansno 58. (a)\9 In general if $A$, $B$, $C$ realizes $(t↓{ijk})$, then
$(x↓1,\ldotss,x↓m)A$, $B$, $C$ is a realization of the $1\times n\times s$ matrix
whose entry in row $j$, column $k$ is $\sum x↓it↓{ijk}$. So there must be at
least as many nonzero elements in $(x↓1,\ldotss,x↓m)A$ as the rank of this matrix.
In the case of the $m\times n\times(m+n-1)$ tensor corresponding to polynomial
multiplication of degree $m-1$ by degree $n-1$, the corresponding matrix has
rank $n$ whenever $(x↓1,\ldotss,x↓m)≠(0,\ldotss,0)$. A similar statement holds
with $A\xch B$ and $m\xch n$.\xskip[In particular, if we work over the field of
2 elements, this says that the rows of $A$ modulo 2 form a ``linear code'' of
$m$ vectors having distance at least $n$, whenever $A$, $B$, $C$ is a realization
consisting entirely of integers. This observation, due to R. W. \α{Brockett} and
D. \α{Dobkin} [{\sl Linear Algebra and its Applic.\ \bf19} (1978), 207--235,
Theorem↔14], can be used to obtain nontrivial lower bounds on the rank over the integers.
For example, M.↔R. \α{Brown} and D. Dobkin have used it to show that realizations
of $n\times n$ polynomial multiplication over the integers must have $t≥3.52n$,
for all sufficiently large $n$; see {\sl IEEE Trans.\ \bf C--29} (1980),
337--340.]
\yskip
(b) $\left(\vcenter{\halign{#\cr
1 0 0 0 0 1 1 1\cr 0 1 0 0 1 1 0 1\cr 0 0 1 1 0 0 1 1\cr}}\right)$,
$\left(\vcenter{\halign{#\cr
1 0 0 0 0 1 1 1\cr 0 1 0 0 0 1 0 1\cr 0 0 1 0 0 0 1 1\cr 0 0 0 1 1 0 0 1\cr}}
\right)$,
$\left(\vcenter{\halign{#\cr 1 0 0 0 0 0 0 0\cr
\=1 \=1 0 0 0 1 0 0\cr \=1 1 \=1 0 0 0 1 0\cr 1 0 0 \=1 \=1 \=1 \=1 1\cr
0 0 1 0 1 0 0 0\cr 0 0 0 1 0 0 0 0\cr}}\right)$.
\ansno 59. [{\sl IEEE Trans.\ \bf ASSP--28} (1980), 205--215.]\xskip
Note that cyclic convolution is polynomial multiplication mod $u↑n-1$, and
negacyclic convolution is \α{polynomial multiplication} mod $u↑n+1$.
Let us now change notation, replacing $n$ by $2↑n$; we shall consider
\α{recursive} algorithms for cyclic and negacyclic convolution $(z↓0,\ldotss,z↓{2↑n-1})$
of $(x↓0,\ldotss,x↓{2↑n-1})$ with $(y↓0,\ldotss,y↓{2↑n-1})$. The algorithms
are presented in unoptimized form, for brevity and ease in exposition; the reader
who implements them will notice that many things can be streamlined.
\algstep C1. [Test for simple case.] If $n=1$, set $z↓0←x↓0y↓0+x↓1y↓1$,
$z↓1←(x↓0+x↓1)(y↓0+y↓1)-z↓0$, and terminate. Otherwise set $m←2↑{n-1}$.
\algstep C2. [Remainderize.] For $0≤k<m$, set $(x↓k,x↓{m+k})←(x↓k+x↓{m+k},
x↓k-x↓{m+k})$ and $(y↓k,y↓{m+k})←(y↓k+y↓{m+k},y↓k-y↓{m+k})$.\xskip
$\biglp$Now we have $x(u)\mod(u↑m-1)=x↓0+\cdots+x↓{m-1}u↑{m-1}$ and $x(u)\mod(u↑m+1)=
x↓m+\cdots+x↓{2m-1}u↑{m-1}$; we will compute $x(u)y(u)\mod(u↑m-1)$ and $x(u)y(u)\mod
(u↑m+1)$, then we will combine the results by (57).$\bigrp$
\jpar1000
\algstep C3. [Recurse.] Set $(z↓0,\ldotss,z↓{m-1})$ to the cyclic convolution
of $(x↓0,\ldotss,x↓{m-1})$ with $(y↓0,\ldotss,y↓{m-1})$. Also set
$(z↓m,\ldotss,z↓{2m-1})$ to the negacyclic convolution of $(x↓m,\ldotss,x↓{2m-1})$ with $(y↓m,\ldotss,
y↓{2m-1})$.
\jpar2
\algstep C4. [Unremainderize.] For $0≤k<m$, set $(z↓k,z↓{m+k})←
{1\over2}(z↓k+z↓{m+k},z↓k-z↓{m+k})$. Now $(z↓0,\ldotss,z↓{2m-1})$ is the
desired answer.\quad\blackslug
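Algorithm C can be transcribed into Python as follows (our names; a schoolbook $O(m↑2)$ negacyclic convolution stands in for Algorithm N, and vectors are indexed by their actual length $2↑n$):

```python
def negacyclic(x, y):
    # schoolbook negacyclic convolution, i.e., x(u)y(u) mod (u^m + 1);
    # a stand-in here for Algorithm N
    m = len(x)
    z = [0]*m
    for i in range(m):
        for j in range(m):
            if i + j < m:
                z[i+j] += x[i]*y[j]
            else:
                z[i+j-m] -= x[i]*y[j]
    return z

def cyclic(x, y):
    # Algorithm C: cyclic convolution of two integer vectors of length 2^n >= 2
    if len(x) == 2:                                  # step C1
        z0 = x[0]*y[0] + x[1]*y[1]
        return [z0, (x[0]+x[1])*(y[0]+y[1]) - z0]
    m = len(x)//2
    xp = [x[k] + x[m+k] for k in range(m)]           # step C2: x mod (u^m - 1)
    xm = [x[k] - x[m+k] for k in range(m)]           #          x mod (u^m + 1)
    yp = [y[k] + y[m+k] for k in range(m)]
    ym = [y[k] - y[m+k] for k in range(m)]
    c = cyclic(xp, yp)                               # step C3
    d = negacyclic(xm, ym)
    # step C4: combine by (57); c[k]+d[k] and c[k]-d[k] are always even
    return ([(c[k] + d[k])//2 for k in range(m)] +
            [(c[k] - d[k])//2 for k in range(m)])
```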
\yskip
\algstep N1. [Test for simple case.] If $n=1$, set $t←x↓0(y↓0+y↓1)$, $z↓0←t-(x↓0+
x↓1)y↓1$, $z↓1←t+(x↓1-x↓0)y↓0$, and terminate. Otherwise set $m←2↑{\lfloor n/2
\rfloor}$ and $r←2↑{\lceil n/2\rceil}$.\xskip
$\biglp$The following steps use $2↑{n+1}$ auxiliary variables $X↓{ij}$
for $0≤i<2m$ and $0≤j<r$, to represent $2m$ polynomials $X↓i(w)=X↓{i0}+X↓{i1}w+
\cdots+X↓{i(r-1)}w↑{r-1}$; similarly, there are $2↑{n+1}$ auxiliary variables↔$Y↓{ij}$.$\bigrp$
\algstep N2. [Initialize auxiliary polynomials.] Set $X↓{ij}←X↓{(i+m)j}←x↓{mj+i}$
and $Y↓{ij}←Y↓{(i+m)j}←y↓{mj+i}$, for $0≤i<m$ and $0≤j<r$.\xskip
$\biglp$At this point we have $x(u)=X↓0(u↑m)+uX↓1(u↑m)+\cdots+u↑{m-1}X↓{m-1}(u↑m)$,
and a similar formula holds for↔$y(u)$. Our strategy will be to multiply these
polynomials modulo $(u↑{mr}+1)=(u↑n+1)$, by operating modulo $(w↑r+1)$ on the
polynomials $X(w)$ and $Y(w)$, finding their cyclic convolution of length
$2m$ and thereby obtaining $x(u)y(u)=Z↓0(u↑m)+uZ↓1(u↑m)+\cdots+u↑{2m-1}Z↓{2m-1}(u↑m)$.$\bigrp$
\jpar100
\algstep N3. [Transform.] $\biglp$Now we will essentially do a \α{fast Fourier
transform} on the polynomials $(X↓0,\ldotss,X↓{m-1},0,\ldotss,0)$ and
$(Y↓0,\ldotss,Y↓{m-1},0,\ldotss,0)$, using $w↑{r/m}$ as a $(2m)$th root of unity.
This is efficient, because multiplication by a power of $w$ is not really a
multiplication at all.$\bigrp$ For $j=\lfloor n/2\rfloor-1$, $\ldotss$, 1,↔0
(in this order), do the following for all $m$ binary numbers $s+t=
(s↓{\lfloor n/2\rfloor}\ldotsm s↓{j+1}0\ldotsm0)↓2+(0\ldotsm0t↓{j-1}\ldotsm t↓0)↓2$:
Replace $\biglp X↓{s+t}(w),X↓{s+t+2↑j}(w)\bigrp$ by the pair of polynomials
$\biglp X↓{s+t}(w)+w↑{(r/m)(s/2)}X↓{s+t+2↑j}(w),
X↓{s+t}(w)-w↑{(r/m)(s/2)}X↓{s+t+2↑j}(w)\bigrp$.\xskip
$\biglp$See Section 4.3.3 and
Eq.\ 4.3.3--33. The operation $X↓i(w)←X↓i(w)+w↑kX↓l(w)$ means,
more precisely, that we set $X↓{ij}←X↓{ij}+X↓{l(j+k)}$ if $j+k<r$, otherwise
$X↓{ij}←X↓{ij}-X↓{l(j+k-r)}$, for $0≤j<r$.$\bigrp$ Do the same transformation
on the $Y$'s.
\jpar2
\algstep N4. [Recurse.] For $0≤i<2m$, set $(Z↓{i0},\ldotss,Z↓{i(r-1)})$ to the
negacyclic convolution of
$(X↓{i0},\ldotss,X↓{i(r-1)})$ and
$(Y↓{i0},\ldotss,Y↓{i(r-1)})$.
\algstep N5. [Untransform.] For $j\!=\!0$, 1, $\ldotss$, $\lfloor n/2\rfloor$
(in this order), set $\biglp Z↓{s+t}(w),Z↓{s+t+2↑j}(w)\bigrp\!←{1\over2}\biglp
Z↓{s+t}(w)+Z↓{s+t+2↑j}(w),
w↑{-(r/m)(s/2)}(Z↓{s+t}(w)-Z↓{s+t+2↑j}(w))\bigrp$, for all $m$ choices of
$s$ and $t$ as in step↔N3.
\algstep N6. [Repack.] (Now we have accomplished the goal stated at the end of
step↔N2, since it is easy to show that the transform of the $Z$'s is the
product of the transforms of the $X$'s and the $Y$'s.)\xskip
Set $z↓i←Z↓{i0}-Z↓{(m+i)(r-1)}$ and $z↓{mj+i}←Z↓{ij}+Z↓{(m+i)(j-1)}$ for
$0<j<r$, for $0≤i<m$.\quad\blackslug
\yyskip
It is easy to verify that at most $n$ extra bits of precision are needed for the
intermediate variables in this calculation; i.e., if $|x↓i|≤M$ for $0≤i<2↑n$
at the beginning of the algorithm, then all of the $x$ and $X$ variables will be
bounded by $2↑nM$ throughout.
Algorithm N performs $A↓n$ addition-subtractions, $D↓n$ halvings, and
$M↓n$ multiplications, where $A↓1=5$, $D↓1=0$, $M↓1=3$; for $n>1$ we
have $A↓n=\lfloor n/2\rfloor2↑{n+2}+2↑{\lfloor n/2\rfloor+1}
A↓{\lceil n/2\rceil}+\biglp\lfloor n/2\rfloor+1\bigrp2↑{n+1}+2↑n$,
$D↓n=2↑{\lfloor n/2\rfloor+1}D↓{\lceil n/2\rceil}+\biglp\lfloor n/2\rfloor+1\bigrp2↑{n+1}$,
and $M↓n=2↑{\lfloor n/2\rfloor+1}M↓{\lceil n/2\rceil}$. The solutions are
$A↓n=11\cdot2↑{n-1+\lceil\lg n\rceil}-3\cdot2↑n+6\cdot2↑nS↓n$,
$D↓n=4\cdot2↑{n-1+\lceil\lg n\rceil}-2\cdot2↑n+2\cdot2↑nS↓n$,
$M↓n=3\cdot2↑{n-1+\lceil\lg n\rceil}$; here $S↓n$ satisfies the recurrence
$S↓1=0$, $S↓n=2S↓{\lceil n/2\rceil}+\lfloor n/2\rfloor$, and it is not difficult
\inx{analysis of algorithms}
to prove the inequalities
${1\over2}n\lceil\lg n\rceil≤S↓n≤{1\over2}n\lg n+n$. Algorithm↔C
does approximately the same amount of work as Algorithm↔N.
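The recurrences and closed forms can be checked mechanically; note that step N4 performs $2m=2↑{\lfloor n/2\rfloor+1}$ negacyclic convolutions of length $r=2↑{\lceil n/2\rceil}$, so the recursive subscripts are $\lceil n/2\rceil$. A sketch (helper names ours):

```python
from math import ceil, log2

def S(n):  # S_1 = 0, S_n = 2 S_{ceil(n/2)} + floor(n/2)
    return 0 if n == 1 else 2*S((n + 1)//2) + n//2

def A(n):  # addition-subtractions
    return 5 if n == 1 else ((n//2)*2**(n+2) + 2**(n//2 + 1)*A((n + 1)//2)
                             + (n//2 + 1)*2**(n+1) + 2**n)

def D(n):  # halvings
    return 0 if n == 1 else 2**(n//2 + 1)*D((n + 1)//2) + (n//2 + 1)*2**(n+1)

def M(n):  # multiplications
    return 3 if n == 1 else 2**(n//2 + 1)*M((n + 1)//2)

def check(n):
    # compare against the stated closed forms
    e = n - 1 + ceil(log2(n))
    return (A(n) == 11*2**e - 3*2**n + 6*2**n*S(n)
            and D(n) == 4*2**e - 2*2**n + 2*2**n*S(n)
            and M(n) == 3*2**e)
```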
It would be interesting to find a simpler way to carry out the additions and
subtractions in step↔N3 (and the reverse operations in↔N5), perhaps analogous
to \α{Yates}'s method. The operation $X↓i←X↓i+w↑kX↓l$ sketched
above can be done
with a procedure that generalizes the data-rotation algorithm of \α{Fletcher} and
\α{Silver} in {\sl CACM \bf9} (1966), 326, but there might be a better way.
\ansno 60. (a)\9 In $\Sigma↓1$, for example, we can group all terms having a
common value of $j$ and↔$k$ into a single trilinear term; this gives $\nu↑2$
trilinear terms when $(j,k)\in E{\times}E$, plus $\nu↑2$ when $(j,k)\in E{\times}O$
and $\nu↑2$ when $(j,k)\in O{\times}E$. When $\s\jit=k$ we can also include
$-x↓{jj\,}y↓{j\s\jit\,}z↓{\s\jit j\,}$ in $\Sigma↓1$, free of charge.\xskip
[In the case $n=10$, the method multiplies 10 by 10 matrices with 710 noncommutative
multiplications; this is fewer than required by any other known method, although
\α{Winograd}'s scheme (35) uses only 600 when commutativity is allowed.]
(b)\9 Here we simply let $S$ be all the indices $(i,j,k)$ of one problem,
$\s S$ the indices of the other.\xskip[When $m=n=s=10$, the result is quite
surprising: We can multiply two separate $10\times10$ matrices with 1300
noncommutative multiplications, while no scheme is known that would multiply
each of them with↔650.]
(c)\9 Corresponding to the left-hand side of the stated identity we get the
terms$$\!x↓{i+ε,j+\zeta\,}y↓{j+\zeta,k+\eta\,}z↓{k+\eta,i+ε}+x↓{j+\eta,k+ε\,}
y↓{k+ε,i+\zeta\,}z↓{i+\zeta,j+\eta}+x↓{k+\zeta,i+\eta\,}y↓{i+\eta,j+ε\,}
z↓{j+ε,k+\zeta}\!\hskip-.33pt$$
summed over $(i,j,k)\in S$ and $0≤ε,\zeta,\eta≤1$, so we get all
the trilinear terms of the form $x↓{ij\,}y↓{jk\,}z↓{ki}$ except when $\lceil i/2
\rceil=\lceil j/2\rceil=\lceil k/2\rceil$; however, these missing terms can
all be included in $\Sigma↓1$, $\Sigma↓2$, or $\Sigma↓3$. The sum $\Sigma↓1$
turns out to include terms of the form $x↓{i+ε,j+\zeta\,}y↓{i+\eta,j+ε}$ times
some sum of $z$'s, so it contributes $8\nu↑2$ terms to the trilinear realization;
and $\Sigma↓2$, $\Sigma↓3$ are similar. To verify that the $aB\Cscr$ terms
cancel out, note that they are $\sum(-1)↑{\zeta+\eta}x↓{i+ε,j+\zeta\,}
y↓{k+ε,i+\zeta\,}z↓{j+ε,k+\zeta}$, so $\eta=1$ cancels with $\eta=0$.\xskip
[This technique leads to asymptotic improvements over Strassen's method whenever
${1\over3}n↑3+6n↑2-{4\over3}n<n↑{\lg7}$, namely when $36≤n≤184$, and it was the first
construction known to break the $\lg7$ barrier. Reference: {\sl SIAM J.
Computing \bf9} (1980), 321--342.]
\def\\{\mathop{\hbox{rank}}}
\ansno 61. (a)\9 Replace $a↓{ij}(u)$ by $ua↓{il}(u)$.\xskip
(b) Let $a↓{il}(u)=a↓{il\mu}u↑\mu$, etc., in a polynomial realization of
length $r=\\↓d(t↓{ijk})$. Then $t↓{ijk}=\sum↓{\mu+\nu+\sigma=d}\sum↓{1≤l≤r}a↓{il\mu}
b↓{jl\nu}c↓{kl\sigma}$.\xskip[This result can be improved to $\\(t↓{ijk})≤(2d+1)
\\↓d(t↓{ijk})$ in an infinite field, because the trilinear form
$\sum↓{\mu+\nu+\sigma}a↓\mu b↓\nu c↓\sigma$ corresponds to multiplication of
polynomials modulo $u↑{d+1}$, as pointed out by \α{Bini} and \α{Pan}.]\xskip
(c,d) This is clear from the realizations in exercise↔48.
\ansno 62. The rank is 3, by the method of proof in Theorem↔W with $P=\left(
{0\atop0}\,{1\atop0}\right)$. The border rank cannot be↔1, since we cannot have
$a↓1(u)b↓1(u)c↓1(u)≡a↓1(u)b↓2(u)c↓2(u)≡u↑d$ and $a↓1(u)b↓2(u)c↓1(u)≡a↓1(u)b↓1(u)
c↓2(u)≡0\modulo{u↑{d+1}}$. The border rank is↔2 because of the realization
$\left({1\atop u}\,{1\atop0}\right)$, $\left({u\atop1}\,{0\atop1}\right)$,
$\left({1\atop0}\,{-1\atop\hfill u}\right)$.
\ansno 63. (a)\9 Let the elements of $T(m,n,s)$ and $T(M,N,S)$ be denoted by
$t↓{\langle i,j↑\prime\rangle\langle j,k↑\prime\rangle\langle k,i↑\prime\rangle}$
and
$T↓{\langle I,J↑\prime\rangle\langle J,K↑\prime\rangle\langle K,I↑\prime\rangle}$,
respectively. Each element
$\Tscr↓{\langle \Iscr,\Jscr↑\prime\rangle\langle \Jscr,\Kscr↑\prime\rangle\langle
\Kscr,\Iscr↑\prime\rangle}$ of the direct product, where $\Iscr=\langle i,I\rangle$,
$\Jscr=\langle j,J\rangle$, and $\Kscr=\langle k,K\rangle$, is equal to
$t↓{\langle i,j↑\prime\rangle\langle j,k↑\prime\rangle\langle k,i↑\prime\rangle}
T↓{\langle I,J↑\prime\rangle\langle J,K↑\prime\rangle\langle K,I↑\prime\rangle}$
by definition, so it is↔1 iff $\Iscr↑\prime=\Iscr$ and $\Jscr↑\prime=\Jscr$ and
$\Kscr↑\prime=\Kscr$.
(b)\9 We have $M(mns)≤r↑3$, since $T(mns,mns,mns)=T(m,n,s)\otimes T(n,s,m)\otimes
T(s,m,n)$. If $M(P)≤R$ we have $M(P↑h)≤R↑h$ for all $h$, and it follows that
$M(N)≤M(P↑{\lceil
\log↓PN\rceil})≤R↑{\lceil\log↓PN\rceil}≤RN↑{\log R/\!\log P}$.\xskip [This
result appears in \α{Pan}'s 1972 paper.]
(c)\9We have $M↓d(mns)≤r↑3$ for some $d$, where $M↓d(n)=\\↓d\biglp T(n,n,n)\bigrp$.
If $M↓d(P)≤R$ we have $M↓{hd}(P↑h)≤R↑h$ for all $h$, and the stated formula follows
since $M(P↑h)≤{hd+2\choose2}R↑h$
by exercise↔61. In an infinite field we save a factor of $\log N$.\xskip [This
result is due to \α{Bini} and \α{Sch\"onhage}, 1979.]
(d)\9Let $P=mns$; we can perform $p↑{3h}$ independent $P↑h\times P↑h$ matrix
multiplications with at most ${hd+2\choose2}p↑{3h}r↑{3h}$ noncommutative scalar
multiplications. Reduce $M(P↑{2h})$ to $M(P↑h)$ matrix multiplications of size
$P↑h\times P↑h$; thus we have $M(P↑{2h})≤{hd+2\choose2}r↑{3h}M(P↑h)\*
\biglp1+O(p/r)↑{3h}\bigrp$.
Iterating this recurrence gives $$M(N)=O(N↑{β(m,n,s,r)})\cdot\exp\biglp
O(\log\log N)↑2\bigrp.$$
\ansno 64. (a)\9 Let $a=x↓{ij}$, $A=uX↓{ki}$, $b=y↓{jk}$, $B=Y↓{ij}$, $c=uz↓{ki}$,
$C=Z↓{jk}$, so that $\Sigma↓2=O(u↑2)$ can be eliminated.\xskip[When $m=s=7$ and
$n=1$, this gives $M(N)=O(N↑{2.66})$.]\xskip(b) Take $α↓1=s-1$, $α↓2=-α↓1↑{-2}$,
$α↓3=α↓4=-1$, $α↓5=1$, $α↓6=s-1$, and $d≥4$.\xskip[We assume that $α↑{-1}$ exists
in the field.]\xskip
(c) Taking the direct product of $T(m,n,2s)\oplus T(2n,s,m)\oplus T(s,2m,n)$ with
itself $3h$ times gives a tensor whose border rank is at most $\biglp2(m+1)n
(s+2)\bigrp↑{3h}$. This tensor is the direct sum of $3↑{3h}$ terms of the
form $T\biglp m↑i(2n)↑js↑k,n↑is↑j(2m)↑k,(2s)↑im↑jn↑k\bigrp$ where $i+j+k=3h$,
and $(3h)!/h!↑3$ of these have $i=j=k=h$. Thus if we let
$P=(2mns)↑h$ and $p=(3h)!/h!↑3$,
the border rank of $pT(P,P,P)$ is at most $pr$, where$$r=\biglp2(m+1)n
(s+2)\bigrp↑{3h}/p.$$Exercise 63(d) now implies that $M(N)=O(N↑{β(P,P,P,r)+ε})$
for all $ε>0$; here $P$ and↔$r$ are functions of $h$. We complete the
proof by letting $h$ be large in $β(P,P,P,r)=\log r/\!\log P=\biglp3h\log2(m+1)n
(s+2)-3h\log3+O(\log h)\bigrp/(3h\log2mns)$,
which equals $β\biglp m,n,2s,{2\over3}(m+1)n(s+2)\bigrp
+O\biglp(\log h)/h\bigrp$.\xskip[The best value is
obtained for $m=5$, $n=1$, $s=11$, $β=3\log↓{110}52<2.522$.]
%folio 840 galley 9b (C) Addison-Wesley 1978 *
\ansbegin{4.7}
\ansno 1. Find the first nonzero coefficient
$V↓m$, as in (4), and divide both $U(z)$ and $V(z)$
by↔$z↑m$ (shifting the coefficients $m$ places to the left). The
quotient will be a power series iff $U↓0 = \cdots = U↓{m-1}
= 0$.
\ansno 2. We have
$V↑{n+1}↓{\!0}W↓n = V↑{n}↓{\!0}U↓n - (V↑{1}↓{\!0}W↓0)(V↑{n-1}↓{\!0}V↓n)
- (V↑{2}↓{\!0}W↓1)(V↑{n-2}↓{\!0}V↓{n-1}) - \cdots - (V↑{n}↓{\!0}W↓{n-1})(V↑{0}↓{\!0}
V↓1)$.
Thus, we can start by replacing $(U↓j, V↓j)$ by $(V↑{j}↓{\!0}U↓j,
V↑{j-1}↓{\!0}V↓j)$ for ${j ≥ 1}$, then set $W↓n ← U↓n - \sum ↓{0≤k<n}
W↓kV↓{n-k}$ for $n ≥ 0$, finally replace $W↓j$ by $W↓j/V↑{j+1}↓{\!0}$
for $j ≥ 0$. Similar techniques are possible in connection with other algorithms
in this section.
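In Python (our function name), this goes as follows; with integer $U↓j$, $V↓j$ the middle loop is division-free, and only the final step divides (by powers of $V↓0$):

```python
from fractions import Fraction

def series_div_all_integer(U, V, N):
    # W = U(z)/V(z) mod z^N by the scaling trick of this answer
    V0 = V[0]
    Uh = [V0**j * U[j] for j in range(N)]                    # V_0^j U_j
    Vh = [None] + [V0**(j - 1) * V[j] for j in range(1, N)]  # V_0^{j-1} V_j; Vh[0] unused
    Wh = []
    for n in range(N):                                       # division-free recurrence
        Wh.append(Uh[n] - sum(Wh[k]*Vh[n - k] for k in range(n)))
    return [Fraction(Wh[j], V0**(j + 1)) for j in range(N)]  # W_j = Wh_j / V_0^{j+1}
```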
\ansno 3. Yes. When $α = 0$, it is easy to prove by induction
that $W↓1 = W↓2 = \cdots = 0$. When $α = 1$, we find $W↓n =
V↓n$, by the ``cute'' identity
$$\sum ↓{1≤k≤n}\left({k - (n - k)\over n}\right)V↓kV↓{n-k} = V↓0V↓n.$$
\ansno 4. If $W(z) = e↑{V(z)}$, then $W↑\prime (z)
= V↑\prime (z)W(z)$; we find $W↓0 = 1$, and
$$W↓n = \sum ↓{1≤k≤n} {k\over n} V↓kW↓{n-k},\qquad\hbox{for }n ≥ 1.$$
If $W(z) = \ln\biglp 1 + V(z)\bigrp $, then $W↑\prime
(z) + W↑\prime (z)V(z) = V↑\prime (z)$; the rule is $W↓0 = 0$,
and $W↓1 + 2W↓2z + \cdots = V↑\prime (z)/\biglp 1 + V(z)\bigrp$.
[By exercise↔6, the logarithm can be obtained
to order $n$ in $O(n\log n)$ operations.\xskip R. P. \α{Brent} observes
that $\exp\biglp V(z)\bigrp$ can also be calculated with this
asymptotic speed by applying \α{Newton's method} to $f(x) = \ln
x - V(z)$; therefore general exponentiation $\biglp 1 + V(z)\bigrp ↑α = \exp\biglp
α \ln(1 + V(z))\bigrp$ is $O(n \log n)$ too. Reference: {\sl Analytic
Computational Complexity}, ed.\ by J. F. \α{Traub} (New York: Academic
Press, 1975), 172--176.]
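The recurrence for the exponential can be sketched as follows (our function name; exact rational arithmetic keeps the check clean):

```python
from fractions import Fraction
from math import factorial

def exp_series(V, N):
    # W = e^{V(z)} mod z^N via W_n = sum_{1<=k<=n} (k/n) V_k W_{n-k}; needs V_0 = 0
    W = [Fraction(1)]
    for n in range(1, N):
        W.append(sum(Fraction(k, n)*V[k]*W[n - k] for k in range(1, n + 1)))
    return W
```

For $V(z)=z$ this reproduces the coefficients $1/n!$ of $e↑z$.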
\ansno 5. We get the original series back again. This can be
used to test a reversion algorithm.
\ansno 6. $\varphi (x) = x + x\biglp 1 - xV(z)\bigrp $, cf.\
Algorithm↔4.3.3R\null. Thus after $W↓0$, $\ldotss$, $W↓{N-1}$ are known,
the idea is to input $V↓N$, $\ldotss$, $V↓{2N-1}$, compute ${(W↓0 + \cdots +
W↓{N-1}z↑{N-1})}\*{(V↓0 + \cdots + V↓{2N-1}z↑{2N-1})} = 1 + R↓0z↑N
+ \cdots + R↓{N-1}z↑{2N-1} + O(z↑{2N})$, and
let $W↓N + \cdots + W↓{2N-1}z↑{N-1}
= -(W↓0 + \cdots + W↓{N-1}z↑{N-1})(R↓0 + \cdots + R↓{N-1}z↑{N-1})
+ O(z↑N)$.\xskip [{\sl Numer.\ Math.\ \bf 22} (1974), 341--348; this algorithm
was, in essence, first published by M. \α{Sieveking,} {\sl Computing
\bf 10} (1972), 153--156.]\xskip Note that the total time for $N$
coefficients is $O(N \log N)$ if we use ``fast'' polynomial
multiplication (exercise 4.6.4--57).
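The doubling step can be sketched in Python (our helper names; ordinary $O(n↑2)$ truncated multiplication stands in for the fast multiplication that yields $O(N\log N)$):

```python
from fractions import Fraction

def poly_mul_trunc(a, b, N):
    # product of coefficient lists a, b, truncated mod z^N
    c = [Fraction(0)]*N
    for i, ai in enumerate(a):
        if i >= N: break
        for j, bj in enumerate(b):
            if i + j >= N: break
            c[i + j] += ai*bj
    return c

def reciprocal(V, N):
    # W(z) with W(z)V(z) = 1 + O(z^N); V[0] must be nonzero
    W = [Fraction(1)/Fraction(V[0])]
    while len(W) < N:
        n = len(W)
        prod = poly_mul_trunc(W, V, 2*n)   # = 1 + R_0 z^n + ... + O(z^{2n})
        R = prod[n:2*n]
        corr = poly_mul_trunc(W, R, n)     # W_{<n} * R mod z^n
        W = (W + [-c for c in corr])[:N]   # W_{n..2n-1} = -(W_{<n} R) mod z^n
    return W
```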
\ansno 7. $W↓n = {mk\choose k}/n$ when $n = (m - 1)k + 1$, otherwise
0.\xskip(Cf.\ exercise 2.3.4.4--11.)
%folio 842 galley 10a (C) Addison-Wesley 1978 *
\ansno 8. Input $G↓1$ in step L1, and $G↓n$ in step L2. In
step L4, the output should be $(U↓{n-1}G↓1 + 2U↓{n-2}G↓2 + \cdots
+ nU↓0G↓n)/n$.\xskip (The running time of the order $N↑3$ algorithm
is hereby increased by only order $N↑2$. The value $W↓1 = G↓1$
might be output in step↔L1.)
{\sl Note:} Algorithms T and N determine $V↑{-1}\biglp
U(z)\bigrp$; the algorithm in this exercise determines $G\biglp
V↑{-1}(z)\bigrp $, which is somewhat different. Of course, the
results can all be obtained by a sequence of operations of reversion
and \α{composition} (exercise↔11), but it is helpful to have more
direct algorithms for each case.
\ansno 9. $\vtop{\halign{$# $⊗$\ctr{#}$\quad⊗$\ctr{#}$\quad⊗$\ctr{#}$\quad
⊗$\ctr{#}$\quad⊗$\ctr{#}$\cr
⊗n = 1⊗n = 2⊗n = 3⊗n = 4⊗n = 5\cr
\noalign{\vskip3pt}
T↓{1n}⊗1⊗1⊗2⊗5⊗14\cr
T↓{2n}⊗⊗1⊗2⊗5⊗14\cr
T↓{3n}⊗⊗⊗1⊗3⊗\99\cr
T↓{4n}⊗⊗⊗⊗1⊗\94\cr
T↓{5n}⊗⊗⊗⊗⊗\91\lower3pt\null\cr}}$
\ansno 10. Form $y↑{1/α} = x(1 + a↓1x + a↓2x↑2 + \cdotss
)↑{1/α} = x(1 + c↓1x + c↓2x↑2 + \cdotss)$ by means of Eq.\ (9);
then revert the latter series.\xskip (See the remarks following Eq.\
1.2.11.3--11.)
\ansno 11. Set $W↓0 ← U↓0$, and set $(T↓k, W↓k) ← (V↓k, 0)$
for $1 ≤ k ≤ N$. Then for $n = 1$, 2, $\ldotss$, $N$, do the following:
Set $W↓j ← W↓j + U↓nT↓j$ for $n ≤ j ≤ N$; and then set $T↓j
← T↓{j-1}V↓1 + \cdots + T↓nV↓{j-n}$ for $j = N$, $N - 1$, $\ldotss
$, $n + 1$.
Here $T(z)$ represents $V(z)↑n$. An {\sl on-line}
power series algorithm for this problem, analogous to Algorithm↔T\null,
could be constructed, but it would require about $N↑2\!/2$
storage locations. There is also an \α{on-line algorithm} that
solves this exercise and needs only $O(N)$ storage locations:
We may assume that $V↓1 = 1$, if $U↓k$ is replaced by $U↓kV↑{k}↓{\!1}$
and $V↓k$ is replaced by $V↓k/V↓1$ for all $k$. Then we may
revert $V(z)$ by Algorithm↔L\null, using its output as input to the
algorithm of exercise↔8 with $G↓1 = U↓1$, $G↓2 = U↓2$, etc., thus
computing $U\biglp (V↑{-1})↑{-1}(z)\bigrp - U↓0$.
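The basic algorithm at the start of this answer can be transcribed directly (our function name; $V↓0$ must be zero, and stale low-order entries of $T$ are harmless because later steps never use them):

```python
def compose(U, V, N):
    # W(z) = U(V(z)) mod z^{N+1}; U, V are coefficient lists of length N+1
    T = list(V)                              # T holds the coefficients of V(z)^n
    W = [U[0]] + [0]*N
    for n in range(1, N + 1):
        for j in range(n, N + 1):
            W[j] += U[n]*T[j]
        for j in range(N, n, -1):            # update T from V^n to V^{n+1}, in place
            T[j] = sum(T[k]*V[j - k] for k in range(n, j))
    return W
```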
\α{Brent} and \α{Kung} have constructed several algorithms
that are asymptotically faster. For example, we can evaluate
$U(x)$ for $x = V(z)$ by a slight variant of exercise 4.6.4--42(c),
doing about $2\sqrt{n}$ chain multiplications of cost $M(n)$
and about $n$ parameter multiplications of cost $n$, where $M(n)$ is the
number of operations needed to multiply power series to order $n$; the total
time is therefore $O\biglp \sqrt{n}M(n) + n↑2\bigrp = O(n↑2)$.
A still faster method can be based on the identity $U\biglp
V↓0(z) + z↑mV↓1(z)\bigrp = U\biglp V↓0(z)\bigrp + z↑mU↑\prime
\biglp V↓0(z)\bigrp V↓1(z) + z↑{2m}U↑{\prime\prime}\biglp V↓0(z)\bigrp
V↓1(z)↑2\!/2! + \cdotss$, extending to about $n/m$ terms, where
we choose $m \approx \sqrt{n/\!\log n}$; the first term $U\biglp
V↓0(z)\bigrp$ is evaluated in $O\biglp mn(\log n)↑2\bigrp$
operations using a method somewhat like that in exercise 4.6.4--43.
Since we can go from $U↑{(k)}\biglp V↓0(z)\bigrp$
to $U↑{(k+1)}\biglp V↓0(z)\bigrp$ in $O(n \log n)$ operations
by differentiating and dividing by $V↑\prime↓{\!0}(z)$,
the entire procedure takes $O\biglp mn(\log n)↑2 + (n/m)\,n
\log n\bigrp = O(n \log n)↑{3/2}$ operations.\xskip[{\sl JACM \bf25} (1978), 581--595.]
\ansno 12. Polynomial division is trivial unless $m ≥ n ≥ 1$.
Assuming the latter, the equation $u(x) = q(x)v(x) + r(x)$ is
equivalent to $U(z) = Q(z)V(z) + z↑{m-n+1}R(z)$ where $U(x)
= x↑mu(x↑{-1})$, $V(x) = x↑nv(x↑{-1})$, $Q(x) = x↑{m-n}q(x↑{-1})$, and
\inx{reverse of a polynomial}
$R(x) = x↑{n-1}r(x↑{-1})$ are the ``reverse'' polynomials of
$u$, $v$, $q$, and $r$.
To find $q(x)$ and $r(x)$, compute the first $m
- n + 1$ coefficients of the power series $U(z)/V(z) = W(z)
+ O(z↑{m-n+1})$; then compute the power series $U(z) - V(z)W(z)$,
which has the form $z↑{m-n+1}T(z)$ where $T(z) = T↓0 + T↓1z
+ \cdotss$. Note that $T↓j = 0$ for all $j ≥ n$; hence $Q(z)
= W(z)$ and $R(z) = T(z)$ satisfy the requirements.
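This reduction of polynomial division to power series division can be sketched as follows (our function name; exact rational arithmetic is used so that the leading coefficient of $v$ need not be 1):

```python
from fractions import Fraction

def poly_divmod(u, v):
    # u(x) = q(x)v(x) + r(x); coefficient lists, constant term first, deg u >= deg v
    u = [Fraction(c) for c in u]; v = [Fraction(c) for c in v]
    m, n = len(u) - 1, len(v) - 1
    U, V = u[::-1], v[::-1]              # the "reverse" polynomials of the answer
    k = m - n + 1
    W = []
    for j in range(k):                   # W = U/V mod z^k
        W.append((U[j] - sum(V[i]*W[j - i] for i in range(1, min(j, n) + 1))) / V[0])
    q = W[::-1]
    r = [u[i] - sum(q[j]*v[i - j] for j in range(max(0, i - n), min(k - 1, i) + 1))
         for i in range(n)]              # r = u - q*v, degrees 0..n-1
    return q, r
```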
\ansno 13. Apply exercise 4.6.1--3 with $u(z)=z↑N$ and
$v(z)=W↓0+\cdots+W↓{N-1}z↑{N-1}$; the desired approximations are the values of
$v↓3(z)/v↓2(z)$ obtained during the course of the algorithm. Exercise 4.6.1--26
tells us that there are no further possibilities with relatively prime
numerator and denominator. If each $W↓i$ is an integer, an all-integer
\inx{polynomial remainder sequence}
extension of Algorithm↔4.6.1C will have the desired properties.
{\sl Notes:} The case $N=2n+1$ and $\hbox{deg}(w↓1)=\hbox{deg}(w↓2)=n$ is of
particular interest, since it is equivalent to a so-called \α{Toeplitz} system.
The method of this exercise can be generalized to arbitrary
rational \α{interpolation} of the
form $W(z)≡p(z)/q(z)$ ${\biglp \hbox{modulo }(z-z↓1)\ldotsm(z-z↓N)\bigrp}$,
where the $z↓i$'s need not be distinct; thus, we can specify
the value of $W(z)$ and some of its derivatives at several points.
See Fred G. \α{Gustavson} and David Y. Y. \α{Yun},
IBM Research Report RC7551 (1979). % to appear
\ansno 14. If $U(z)=z+U↓kz↑k+\cdots$ and $V(z)=z↑k+V↓{k+1}z↑{k+1}+\cdotss$, we
find that the difference
$V\biglp U(z)\bigrp-U↑\prime(z)V(z)$ is $\sum↓{j≥1}z↑{2k+j-1}j\biglp
U↓kV↓{k+j}-U↓{k+j}+($polynomial involving only $U↓k$, $\ldotss$, $U↓{k+j-1}$,
$V↓{k+1}$, $\ldotss$, $V↓{k+j-1})\bigrp$; hence $V(z)$ is unique if $U(z)$ is
given and $U(z)$ is unique if $V(z)$ and $U↓k$ are given.
The solution depends on two auxiliary algorithms, the first of which
solves the equation $V\biglp z+z↑kU(z)\bigrp=\biglp 1+z↑{k-1}W(z)\bigrp V(z)+
z↑{k-1}S(z)+O(z↑{k-1+n})$ for $V(z)=V↓0+V↓1z+\cdots+V↓{n-1}z↑{n-1}$, given
$U(z)$, $W(z)$, $S(z)$, and $n$. If $n=1$, let $V↓0=-S(0)/W(0)$; or let $V↓0=1$
when $S(0)=W(0)=0$. To go from $n$ to $2n$, let
$$\eqalign{V\biglp z+z↑kU(z)\bigrp⊗=\biglp
1+z↑{k-1}W(z)\bigrp V(z)+z↑{k-1}S(z)-z↑{k-1+n}R(z)+O(z↑{k-1+2n}),\cr
1+z↑{k-1}\A W(z)⊗=\biglp z/(z+z↑kU(z))\bigrp↑n\biglp 1+z↑{k-1}W(z)\bigrp,\cr
\A S(z)⊗=\biglp z/(z+z↑kU(z))\bigrp↑nR(z),\cr
\noalign{\vskip 11pt plus 3pt minus 8pt
\hbox{and let $\A V(z)=V↓n+V↓{n+1}z+\cdots+V↓{2n-1}z↑{n-1}$ satisfy}
\vskip 11pt plus 3pt minus 8pt}
\;\A V\biglp z+z↑kU(z)\bigrp⊗=\biglp 1+z↑{k-1}\A W(z)\bigrp
\A V(z)+z↑{k-1}\A S(z)+O(z↑{k-1+n}).\cr}$$
The second algorithm solves $W(z)U(z)+zU↑\prime(z)=V(z)+O(z↑n)$ for $U(z)=U↓0+
U↓1z+\cdots+U↓{n-1}z↑{n-1}$, given $V(z)$, $W(z)$, and $n$. If $n=1$, let
$U↓0=V(0)/W(0)$, or let $U↓0=1$ in case $V(0)=W(0)=0$. To go from $n$ to $2n$, let
$W(z)U(z)+zU↑\prime(z)=V(z)-z↑nR(z)+O(z↑{2n})$, and let $\A U(z)=U↓n+\cdots+
U↓{2n-1}z↑{n-1}$ be the solution to the equation
$\biglp n+W(z)\bigrp\A U(z)+z{\A U}↑\prime(z)=R(z)
+O(z↑n)$.
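The specification of this second algorithm can also be met by a direct $O(n↑2)$ coefficient-by-coefficient solver, which is convenient for checking the doubling version (our function name; comparing coefficients of $z↑j$ gives $(W↓0+j)U↓j=V↓j-\sum↓{1≤i≤j}W↓iU↓{j-i}$):

```python
from fractions import Fraction

def solve_aux(W, V, n):
    # solve W(z)U(z) + zU'(z) = V(z) + O(z^n) for U, directly
    U = []
    for j in range(n):
        s = V[j] - sum(W[i]*U[j - i] for i in range(1, j + 1))
        U.append(s / (W[0] + j))     # assumes W_0 + j is nonzero
    return U
```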
Resuming the notation of (27), the first algorithm can be used to solve
$\A V\biglp U(z)\bigrp=U↑\prime(z)\biglp z/U(z)\bigrp↑k\A V(z)$ to any desired
accuracy, and we set $V(z)=z↑k\A V(z)$. To find $P(z)$, suppose we have $V\biglp
P(z)\bigrp=P↑\prime(z)V(z)+O(z↑{2k-1+n})$, an equation that holds for $n=1$
when $P(z)=z+αz↑k$ and $α$ is arbitrary.
We can go from $n$ to $2n$ by letting $V\biglp P(z)\bigrp
=P↑\prime(z)V(z)+z↑{2k-1+n}R(z)+O(z↑{2k-1+2n})$ and replacing $P(z)$ by $P(z)+
z↑{k+n}\A P(z)$, where the second algorithm is used to find the polynomial
$\A P(z)$ such that
\def\Phat{\save7\hbox{$P$}\vbox to 1ht7{\vss\hbox{$\A{\box7}$}}}
$\biglp k+n-zV↑\prime(z)/V(z)\bigrp\A P(z)+z\Phat↑\prime(z)=\biglp z↑k/V(z)\bigrp
R(z)+O(z↑n)$.
\vfill\end